Gibbon Ape Leukemia Virus-based Retroviral Vectors

BACKGROUND OF THE INVENTION 
The present invention relates generally to 
retroviral vectors. In particular, the invention relates to 
retroviral vectors comprising nucleic acid sequences from
Gibbon Ape Leukemia Virus.

Considerable effort is now being directed to 
introducing engineered genes into mammalian cells for a 
variety of applications including gene therapy and the 
production of transgenic animals. Such strategies are
dependent upon the development of effective means for safe
delivery of genes to appropriate target cells and tissues.

Retroviral vectors are particularly useful for
directing desired polynucleotides to the appropriate cells and
integrating the polynucleotides into the host cell genome.
For example, the majority of the approved gene transfer trials
in the United States rely on replication-defective retroviral
vectors harboring a therapeutic polynucleotide sequence as
part of the retroviral genome (Miller et al. Mol. Cell. Biol.
10:4239 (1990); Kolberg, R., J. NIH Res. 4:43 (1992); Cornetta et
al. Hum. Gene Ther. 2:215 (1991)). As is known in the art,
the major advantages of retroviral vectors for gene therapy
are the high efficiency of gene transfer into certain types of
replicating cells, the precise integration of the transferred
genes into cellular DNA, and the lack of further spread of the
sequences after gene transfer.

Unfortunately, many human cells are not efficiently 
infected by prior art retroviral vectors. Reduced 
susceptibility to retroviral infection is most likely due to 
inefficiencies in one of three stages of viral replication:
1) binding to retroviral receptors on the cell surface and
early viral entry, 2) late entry and transport of the viral
genome to the cell nucleus and integration of the viral genome
into the target cell DNA, and 3) expression of the viral
genome. These three stages are governed, respectively, by the
viral envelope proteins, the viral core proteins, and the
viral genome. All three of these components must function
efficiently in a target cell to achieve optimal therapeutic
gene delivery.

WO 94/23048 PCT/US94/03784

Gibbon Ape Leukemia Virus (GaLV) uses a cell 
surface internalization receptor that is different from those 
of the available retroviral vectors and thus allows infection 
of cells and tissues normally resistant to retroviral
infection. The human receptor for GaLV has recently been
cloned and shows a wide cell type and species distribution.
Johann et al., J. Virol. 66:1635-1640 (1992). Indeed, GaLV
can infect many mammalian species with the notable exception
of mouse cells. The same receptor is used by simian sarcoma
associated virus (SSAV), a strain of GaLV. Sommerfelt et al.,
Virol. 176:58-59 (1990).

The construction of hybrid virions having GaLV 
envelope proteins has been demonstrated. For instance, Wilson
et al., J. Virol. 63:2374-2378 (1989), describe preparation of
infectious hybrid virions with GaLV and human T-cell leukemia
virus retroviral env glycoproteins and the gag and pol
proteins of the Moloney murine leukemia virus (MoMLV). In
addition, Miller et al., J. Virol. 65:2220-2224 (1991),
describe construction of hybrid packaging cell lines that
express GaLV envelope and MoMLV gag-pol proteins.

Existing retroviral vectors capable of infecting
human cells all contain core and genome components that derive
from MoMLV. For human cells which are resistant to efficient
infection by such vectors at any of the three stages noted
above, new vectors comprising improved envelope, core or
regulatory sequences must be designed. Thus, there is a need
to design retroviral vector components which can be used to
introduce genes into human cells not efficiently infected by
the currently utilized retroviral vectors. The present
invention addresses these and other needs.

SUMMARY OF THE INVENTION
The present invention provides recombinant DNA 
constructs comprising a defective viral genome having a 
polynucleotide sequence of interest and a GaLV component. For 
instance, the GaLV component may be a GaLV packaging site
which directs packaging of the defective viral genome in an
infectious, replication-defective virion. The packaging site
typically consists of between about 150 base pairs and about
1500 base pairs and includes a sequence extending from about
position 200 to about position 1290 of the sequence shown in
Figure 1.

The construct may further comprise GaLV regulatory 
sequences which direct expression of the polynucleotide of 
interest. Typically, the regulatory sequences comprise a GaLV
(e.g., GaLV SEATO or GaLV SF) 5' or 3' LTR promoter.

The invention also relates to mammalian cells
comprising the defective viral genome described above. The
mammalian cells may be packaging cells, in which case the
cells will also contain retroviral gag, pol and env genes.
These genes may be derived from MoMLV, GaLV SF or GaLV SEATO.
Packaging cells conveniently used in the invention include 
PG13 and PA317. 

The invention further provides isolated hybrid 
virions comprising GaLV (e.g., SF or SEATO) envelope proteins
and an RNA genome comprising a polynucleotide sequence of
interest and a GaLV component. The virions typically contain
GaLV core proteins. MoMLV core proteins can also be used.

The invention also provides isolated recombinant DNA
constructs comprising polynucleotide sequences which encode an
infectious GaLV virion capable of infecting a mammalian cell
and producing functional viral progeny. The infectious clones
typically comprise about 97% GaLV SEATO sequences and 3% GaLV
SF sequences.

Also disclosed are methods of introducing a
polynucleotide of interest into human cells using the hybrid
virions described above. The methods are preferably used as
part of a gene therapy protocol for treating a human patient.

DEFINITIONS

A "hybrid virion" is a virion comprising genome, 
core, and envelope components derived from more than one 
virus. The term specifically includes "pseudovirions" which
historically have been defined as containing the genome from
one virus and the structural proteins from another.

A "packaging cell" is a genetically constructed 
mammalian tissue culture cell that produces the necessary 
viral structural proteins required for packaging. The cells
are incapable of producing infectious virions until a
defective genome is introduced into the cells. The genetic
material for the viral structural proteins is not transferred
with the virions produced by the cells, hence the virus cannot
replicate.

A "replication-defective" virion or retroviral
vector is one produced by a packaging cell as defined above.
Such a virion infects a target cell but is incapable of 
producing progeny virions which can infect other cells. 

Two polynucleotides or polypeptides are said to be
"identical" if the sequence of nucleotides or amino acid
residues in the two sequences is the same when aligned for
maximum correspondence. Optimal alignment of sequences for
comparison may be conducted by the local homology algorithm of
Smith and Waterman, Adv. Appl. Math. 2:482 (1981), by the
homology alignment algorithm of Needleman and Wunsch, J. Mol.
Biol. 48:443 (1970), by the search for similarity method of
Pearson and Lipman, Proc. Natl. Acad. Sci. (U.S.A.) 85:2444
(1988), by computerized implementations of these algorithms
(GAP, BESTFIT, FASTA, and TFASTA in the Wisconsin Genetics
Software Package, Genetics Computer Group, 575 Science Dr.,
Madison, WI), or by inspection. These references are
incorporated herein by reference.

The percentage of sequence identity between two 
sequences is determined by comparing two optimally aligned
sequences over a window of comparison of at least 20
positions. The percentage is calculated by determining the
number of positions at which the identical nucleic acid base
or amino acid residue occurs in both sequences to yield the
number of matched positions, dividing the number of matched
positions by the total number of positions in the window of
comparison (i.e., the window size), and multiplying the result
by 100 to yield the percentage of sequence identity.
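
As a minimal sketch of the calculation just described, the Python
function below counts matched positions in two already-aligned
sequences and divides by the window size. It is an illustration only
and is not the GAP program; the function name, the use of "-" to mark
alignment gaps, and the example sequences are assumptions made for
this sketch.

```python
def percent_identity(aligned_a: str, aligned_b: str, min_window: int = 20) -> float:
    """Percent identity of two pre-aligned sequences over one comparison window."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    window = len(aligned_a)
    if window < min_window:
        raise ValueError(f"comparison window must span at least {min_window} positions")
    # A position is matched only when both sequences carry the same residue;
    # gap characters ("-") are never counted as matches (an assumption here).
    matched = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-")
    return 100.0 * matched / window


# Hypothetical 24-position example sequences that differ at one position.
seq_a = "ACGTACGTACGTACGTACGTACGT"
seq_b = "ACGTACGTACGAACGTACGTACGT"
print(percent_identity(seq_a, seq_b))
```

In this example the window holds 24 positions, of which 23 are
matched, so the result is 23 divided by 24, multiplied by 100.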

For instance, a preferred method for comparing
sequences uses the GAP program based on the algorithm of
Needleman and Wunsch, supra. Typically, the default values for
all parameters are selected. These are gap weight: 5.0,
length weight: 0.30, average match: 1.0, and average mismatch:
0.0.

The term "substantial identity" means that a 
polynucleotide or polypeptide comprises a sequence that has at 
least 80% sequence identity, preferably 90%, more preferably 
95% or more, compared to a reference sequence over a
comparison window of about 20 bp to about 2000 bp, typically
about 50 to about 1500 bp, usually about 350 bp to about 1200 bp.
The values of percent identity are determined using the GAP 
program, above. 

Another indication that nucleotide sequences are
substantially identical is if two molecules hybridize to each
other under stringent conditions. Stringent conditions are
sequence dependent and will be different in different
circumstances. Generally, stringent conditions are selected
to be about 5°C lower than the thermal melting point (Tm) for
the specific sequence at a defined ionic strength and pH. The
Tm is the temperature (under defined ionic strength and pH) at
which 50% of the target sequence hybridizes to a perfectly
matched probe. Typically, stringent conditions will be those
in which the salt concentration is about 0.2 molar at pH 7 and
the temperature is at least about 60°C.

BRIEF DESCRIPTION OF THE DRAWINGS 
Figure 1 shows the complete sequence of the GaLV
SEATO genome, as published in Delassus et al. (1989) Virol.
173:205-213.

Figures 2A-2F show the construction of the
infectious GaLV clone of the invention.

Figure 3 shows packagable defective genomes of the
present invention.

Figure 4 shows schematic diagrams of plasmids 395,
558, and 521.

Figure 5 shows schematic diagrams of plasmids 395,
559 and 537.

DESCRIPTION OF THE PREFERRED EMBODIMENTS 
New hybrid retroviral vectors comprising GaLV
components are provided by the present invention. The tissue
specificity of the vectors is determined by the viral envelope
proteins, the viral core proteins, and the viral genome, at
least one of which is derived from GaLV. The vectors can
comprise the minimal cis-acting sequences (packaging signals)
from GaLV that allow packaging of a defective genome in a
replication-defective hybrid virion. In addition, the LTR of
the defective genome can be derived from GaLV. For instance,
the 3' LTR region of the hybrid retroviral vector can be
selected from various GaLV sequences to provide desired tissue
specific expression of the structural genes in the genome.

Replication-defective retroviral vectors are 
produced when a defective DNA viral genome is introduced into 
a packaging cell line. The defective genome contains the 
sequences required for integration into the target cell
genome, for packaging of the genome into infectious virions,
as well as those viral sequences required for expression of
the therapeutic gene or other polynucleotide contained within
the defective viral genome. The packaging cells comprise the
gag, pol, and env genes which encode the viral core and
envelope components. These core and envelope proteins
assemble around the defective genome, thus producing 
retroviral vectors. 

A number of standard techniques are used to ensure 
safety of retroviral vectors. For instance, the defective
genome is introduced into the cell separately from the genes
encoding the core and envelope components. In this way,
recombination between the genome and the core and envelope
genes, which would lead to the packaging of complete viral
genomes, is extremely unlikely. The resulting virions should
therefore not comprise the gag, pol, and env genes and are
thus replication-defective. Homologous recombination,
however, between the inserts can lead to the production of
infectious virions. Typically, the packaging cells are
produced by introducing the gag, pol, and env genes on at
least two separate plasmids. This scheme effectively prevents
homologous recombination leading to reconstruction of
infectious virus because the probability of multiple,
independent homologous recombination events occurring is
extremely low.
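
The reasoning behind the multi-plasmid design can be illustrated
numerically. The sketch below assumes, purely for illustration, that
each homologous recombination event occurs independently and with the
same probability; the function name and the value 1e-6 are
hypothetical and are not measurements from this disclosure.

```python
def reconstitution_probability(p_event: float, events_required: int) -> float:
    """Probability that all required independent recombination events occur."""
    if not 0.0 <= p_event <= 1.0:
        raise ValueError("p_event must be a probability between 0 and 1")
    if events_required < 1:
        raise ValueError("at least one event is required")
    # Independent events multiply, so every additional required event
    # lowers the overall probability by another factor of p_event.
    return p_event ** events_required


P_EVENT = 1e-6  # hypothetical per-event probability, for illustration only
for n in (1, 2, 3):
    print(n, reconstitution_probability(P_EVENT, n))
```

Under these assumptions, splitting the structural genes so that more
independent events are needed reduces the chance of reconstructing an
infectious genome multiplicatively rather than additively.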

Retroviral vectors can also be designed to prevent 
synthesis of viral proteins by the integrated defective 
genome. For instance, if a portion of the gag gene is
included to increase packaging efficiency, a stop codon can be
introduced into the gene to prevent synthesis of gag proteins.
Miller et al., BioTechniques 7:982-988 (1989), which is
incorporated herein by reference.

In addition, the cells used to make packaging cells
do not possess a cell receptor for GaLV and are thus not
infectable by GaLV. Retroviral vector virions having the GaLV
envelope therefore cannot reinfect the packaging cells and
vector spread in the packaging cells is greatly reduced.
Suitable packaging cells also have limited or no endogenous
viral sequences. Cell lines for this purpose include the Mus
dunni tail fibroblast cell line. This strategy decreases the
potential for generation of recombinant vectors, which are
often transmitted with higher efficiency than the parental
vector.

Finally, replication-defective vectors of the
invention are particularly safe because GaLV is evolutionarily
derived from a xenotropic virus of an Asian strain of mouse
and does not appear to be closely related to human pathogenic
viruses. Thus, in terms of containment, GaLV-based,
replication-defective hybrid virions are as safe as prior art
murine retroviral vectors and provide a safe vehicle for
delivery of genes for human gene therapy.

The packaging cell lines of the invention can be
used to provide infectious replication-defective hybrid
virions for use in gene transfer in humans, hamsters, cows,
cats, dogs, monkeys, chimpanzees, macaques, primates, and
other species whose cells have host cell receptors for GaLV
envelope proteins.

Generally, the nomenclature used hereafter and the 
laboratory procedures in cell culture, molecular genetics, and 
nucleic acid chemistry described below are those well known
and commonly employed in the art. Standard techniques are
used for recombinant nucleic acid methods, polynucleotide
synthesis, and cell culture. Generally, enzymatic reactions,
oligonucleotide synthesis, oligonucleotide modification, and
purification steps are performed according to the
manufacturers' specifications. The techniques and procedures
are generally performed according to conventional methods in
the art and various general references which are provided
throughout this document. A basic text disclosing the general
methods of use in this invention is Sambrook et al., Molecular
Cloning, A Laboratory Manual, Cold Spring Harbor Publish.,
Cold Spring Harbor, NY 2nd ed. (1989), which is incorporated
herein by reference.

A first step in the synthesis of retroviral vectors 
of the invention is obtaining an infectious GaLV DNA clone.
Proviral DNA from at least three GaLV strains (GaLV SF, GaLV
SEATO, and SSAV) has been cloned. A GaLV SF clone including
both ends of the GaLV SF genome and the envelope gene but not
an intact region of the genome encoding the core proteins is
reported by Scott et al. Proc. Natl. Acad. Sci. USA
78:4213-4217 (1981). A partial clone containing the envelope and part
of the genome but not the region encoding core proteins of
SSAV is described by Gelman et al. Proc. Natl. Acad. Sci. USA
78:3373-3377 (1981). Finally, Gelman et al. J. Virol.
44:269-275 (1982) disclose a partial clone of a third GaLV strain,
SEATO, containing all but 350 bases of the core region of
GaLV. This clone has been sequenced in its entirety by
Delassus et al. Virol. 173:205-213 (1989) (see Figure 1). The
deleted 350 bases were also sequenced but from a PCR fragment
generated from viral RNA expressed in a GaLV SF infected cell
line. The sequence of an integrated form of a GaLV SEATO
genome is also shown in Seq ID No. 1. All of the above
references are incorporated herein by reference.

Example 1 describes the construction of an
infectious GaLV clone comprising sequences from GaLV SEATO and
GaLV SF. This construction can be used to prepare a number of 
retroviral vectors, as described in detail below. 

Packaging Cells 

Packaging cells for use in the present invention may
be made from any animal cell, such as CHO cells, NIH 3T3, mink
lung cells, D17 canine cells, and MDBK cells. One or both of
the core and envelope components can be encoded by GaLV genes.
The core and envelope components, however, need not be derived
from the same GaLV strain. Indeed, in some embodiments, the
core components may be derived from a different species (e.g.,
MoMLV). For example, the PG13 murine packaging cell line
produces virion particles having MoMLV core and GaLV envelope
components (see Miller et al. (1991) J. Virol. 65:2220-2224).

To prepare a packaging cell line, an infectious
clone of a desired retrovirus (e.g., GaLV SEATO) in which the
packaging site (ψ) has been deleted is constructed. Cells
comprising this construct will express all GaLV structural
proteins but the introduced DNA will be incapable of being
packaged. Alternatively, packaging cell lines can be produced
by transforming a cell line with one or more expression 
plasmids encoding the appropriate core and envelope proteins. 
In these cells, the gag, pol, and env genes can be derived 
from the same or different retroviruses. 

Although certain cells may express the receptor for
a retroviral vector, the cells may not be efficiently infected
because of a loss of optimum fit between the receptor and the
envelope proteins. For example, altered glycosylation
patterns may inhibit retroviral infection (Wilson et al., J.
Virol. 65:5975-5982 (1991), which is incorporated herein by
reference). In addition, retroviruses in the same receptor
class can exhibit different host ranges due to single amino
acid differences in target cell receptors.

In light of these considerations, it may be
necessary to modify the envelope proteins of the hybrid
virions to adjust the host range. The proteins may be
modified to either allow infection of cells previously
resistant to infection or to prevent infection of non-target
cells.

One strategy for modifying envelope proteins is the 
use of an in vitro selection scheme. In this approach, an 
infectious clone of the retrovirus along with a selectable
marker gene is introduced into target cells that are resistant
to infection. Those cells which have been infected by
retroviruses comprising mutations allowing infection of the
cells are then identified by standard reverse transcriptase
assays of the culture supernatant. The env gene of the
adapted retrovirus is cloned and sequenced and used to
construct new retroviral vectors capable of efficiently
infecting the target cell. This strategy is particularly
useful in isolating variants capable of infecting a number of
human cells currently resistant to GaLV infection such as
tumor infiltrating lymphocytes, bone marrow cells, stem cells,
and hepatocytes. 

Alternatively, if the gene encoding the cell 
receptor has been cloned, the gene can be inserted in a cell 
line which does not normally produce the receptor. Variant
retroviruses capable of binding the receptor can then be
identified in the same manner as described above. For
instance, the human GaLV cell surface receptor has been cloned
and sequenced (U.S. Patent No. 5,151,361, and Johann et al.,
J. Virol. 66:1635-1640 (1992), which are incorporated herein
by reference). Thus, this gene can be used to identify new
retroviral vectors expressing modified envelope proteins. 

A third approach to modifying the host range of a
retrovirus vector is to modify the envelope proteins
directly. Modifications of the sequences encoding the
polypeptides may be readily accomplished by a variety of
well-known techniques, such as site-directed mutagenesis (see,
e.g., Gillman and Smith, Gene 8:81-97 (1979) and Roberts, S.
et al., Nature 328:731-734 (1987), which are incorporated
herein by reference). The effect of the modifications is
evaluated by screening for the ability of the engineered
virions to infect a target cell.

In addition, specific polynucleotide sequences 
encoding desired polypeptides can be fused to the env gene
using methods known to those skilled in the art. Gene fusions
comprising sequences encoding antibodies, SCF, IL-6,
somatostatin and the like can thus be used as a targeting
means. The fused gene can be inserted into an appropriate
plasmid for transformation into the packaging cells.

In addition, the envelope protein can be modified,
for example, by introducing point mutations in the protein to
yield moieties for coupling by organic chemical means (e.g.,
insertion of a cysteine residue to give a sulfhydryl group).
Cell-specific targeting moieties can be coupled with
glutaraldehyde, periodate, or maleimide compounds, or by other
means known to those skilled in the art. Such couplings may
also be made directly to wild-type or unmodified envelope
proteins where coupling can be to a carbohydrate moiety, a
sulfhydryl group, an amino group, or other group which may be
available for binding.

A number of packaging cell lines suitable for the 
present invention are also available in the prior art. These
lines include Crip and GPE-Am. Preferred existing cell lines
include PA317 (ATCC CRL 9078) which expresses MoMLV core and
envelope proteins and PG13 (ATCC CRL 10,683) which produces
virions having MoMLV core and GaLV envelope components. (See
Miller et al. J. Virol. 65:2220-2224 (1991), which is
incorporated herein by reference.) The PG13 packaging cell
line can be used in conjunction with the 521 plasmid and the
537 plasmid, both of which contain 5' MoMLV LTR and packaging
signal sequences (see Example 3, herein).

Defective Genomes 
The other component of retroviral vectors is a
packagable defective genome comprising a polynucleotide
sequence, typically a structural gene, of interest. The
defective genomes of the invention include a GaLV component,
which includes the minimal GaLV nucleotide sequences that must be
present in the defective genome itself for the genome to
integrate in the target cell genome and be packaged in
infectious virions (i.e., the sequences are required in cis).
Thus, the GaLV component of the defective genomes of the
invention may include the packaging site, ψ, and/or the long
terminal repeated sequences (LTRs). The LTRs are positioned
at either end of the proviral DNA and contain regulatory
sequences (e.g., promoters, enhancers and polyadenylation
sequences) which direct expression of the genes within the
proviral DNA. The polynucleotide sequences of the GaLV
component may be identical to sequences as shown, for
instance, in SEQ ID No. 1, or may be substantially identical
to that sequence as defined above.

Typically, the proviral regulatory sequences drive
expression of the inserted gene. In those embodiments where
two inserted genes are included (e.g., a marker gene and the
gene of interest) it is frequently desirable to include a
virus internal ribosome entry site (IRES) to increase
efficiency of expression (Ghattas et al., Mol. Cell. Biol.
11:5848-5859 (1991), which is incorporated herein by
reference).

The promoter operably linked to the gene of interest
may be constitutive, cell type-specific, stage-specific,
and/or modulatable (e.g., by hormones such as
glucocorticoids). Suitable promoters for the invention
include those derived from genes such as early SV40, CMV major
late, adenovirus immediate early, histone H4, β-actin, MMTV,
and HSV-TK.

Enhancers increase the rate of transcription from
promoters, act on cis-linked promoters at great distances, are
orientation independent, and can be located both upstream
(5') and downstream (3') from the transcription unit.
Enhancers inducible by hormones and metal ions and found only
in specific tissues have been described. Proteins synthesized
only in one tissue type, for example, actin and myosin in
muscle, are frequently regulated by tissue specific enhancers.
For tissue specific expression of the introduced genes of
interest used in the retroviral vectors of the present
invention, tissue-specific enhancers are of particular
interest.

A repetitive 45 base pair enhancer element in the U3
region of the GaLV LTR is important for tissue specific
expression of the introduced genes. This enhancer region is
present only once in the 3' LTR of GaLV SF but is present 3
times in the 3' LTR of GaLV SEATO. (See Quinn et al., Mol.
Cell. Biol. 7:2735-2744, which is incorporated herein by
reference). The sequence of the 3' LTR of GaLV SEATO with 3
repeats of the 45 bp enhancer region is shown in Seq. ID No. 2.
Thus, the origin of the 3' GaLV LTR region (from GaLV SEATO
or GaLV SF) in a retroviral vector can influence the
expression of the introduced gene in different tissues (see
Example 4, herein).
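
The difference in enhancer copy number between strains can be checked
computationally once a U3 sequence is in hand. The following is a
minimal sketch that counts back-to-back copies of a repeat element;
the function name is hypothetical, and the 45-base element and the
flanking bases are arbitrary placeholders, not the actual GaLV
enhancer or LTR sequences.

```python
def count_tandem_repeats(region: str, element: str) -> int:
    """Return the largest number of back-to-back copies of element in region."""
    if not element:
        raise ValueError("element must not be empty")
    best = 0
    start = region.find(element)
    while start != -1:
        copies = 0
        pos = start
        # Walk forward one element length at a time while copies continue.
        while region.startswith(element, pos):
            copies += 1
            pos += len(element)
        best = max(best, copies)
        start = region.find(element, start + 1)
    return best


# Placeholder 45-base element (NOT the real GaLV enhancer sequence).
ELEMENT = "GATTACAGGCTTAACCGGTTAGCATCGATCGGATCCTTAAGGCAT"
assert len(ELEMENT) == 45

one_copy_u3 = "TTGACC" + ELEMENT + "AAGCTT"        # one copy, as described for GaLV SF
three_copy_u3 = "TTGACC" + ELEMENT * 3 + "AAGCTT"  # three copies, as described for GaLV SEATO
print(count_tandem_repeats(one_copy_u3, ELEMENT))
print(count_tandem_repeats(three_copy_u3, ELEMENT))
```

The sketch requires exact copies; real repeats may differ by a few
bases, in which case an alignment-based comparison would be needed.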

To ensure efficient expression, 3' polyadenylation
regions must be present to provide for proper maturation of
the mRNA transcripts. The native 3'-untranslated region of
the gene of interest is preferably used, but the
polyadenylation signal from, for example, SV40, particularly
including a splice site, which provides for more efficient
expression, could also be used. Alternatively, the
3'-untranslated region derived from a gene highly expressed in
a particular cell type could be fused with the gene of
interest.

The retroviral vectors of the invention also contain
GaLV-based regulatory elements that can direct expression of
genes contained within the genome in a tissue/cell specific
manner. In general, the GaLV regulatory elements are more
efficient than the MoMLV elements in expressing genes in human
cells. In addition, the regulatory sequences from different
GaLV strains have different cell and tissue specificities.
For instance, GaLV SF regulatory genes function efficiently in
primate lymphoid cells (e.g., UCD 144) and GaLV SEATO
regulatory genes function efficiently in human myeloid cells
(e.g., HL60 cells), while MoMLV regulatory genes do not.
Thus, tissue specificity of the vectors of the invention can
be modified by selecting the appropriate GaLV strain. Tissue
specificity of the regulatory genes from various GaLV strains
is determined using routine screening techniques well-known to
those of skill in the art.

The 5' and 3' LTRs of one retrovirus or GaLV strain
may also be used in a defective genome derived from another.
For instance, the 3' LTR from SSAV can be substituted for the
3' LTR of an infectious clone of another GaLV strain. Since
the U3 region of the 3' LTR is the template for the synthesis
of the U3 region in both 5' and 3' LTRs of the progeny virus,
the 3' LTR will be duplicated and transferred to the 5' LTR in
the host cell. In this way optimal expression of the gene of
interest in the target cell can be achieved.
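
The U3 duplication described above can be modeled with a small data
structure. This is a simplified sketch, not a simulation of reverse
transcription: the class and function names are hypothetical, and
taking the R and U5 regions of the provirus from the 5' LTR of the
vector is a simplifying assumption based on the general retroviral
model rather than something stated in this text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LTR:
    """Origin (virus or strain name) of each LTR sub-region."""
    u3: str
    r: str
    u5: str


@dataclass(frozen=True)
class VectorGenome:
    five_prime_ltr: LTR
    three_prime_ltr: LTR


def provirus_ltrs(vector: VectorGenome) -> VectorGenome:
    """Return the LTRs expected in the provirus formed in the target cell."""
    proviral_ltr = LTR(
        u3=vector.three_prime_ltr.u3,  # U3 is templated by the 3' LTR
        r=vector.five_prime_ltr.r,     # simplifying assumption, see lead-in
        u5=vector.five_prime_ltr.u5,   # simplifying assumption, see lead-in
    )
    # Both proviral LTRs are identical copies of the same U3-R-U5 unit.
    return VectorGenome(five_prime_ltr=proviral_ltr, three_prime_ltr=proviral_ltr)


def uniform_ltr(origin: str) -> LTR:
    """Build an LTR whose three sub-regions all come from one source."""
    return LTR(u3=origin, r=origin, u5=origin)


# A GaLV SEATO clone whose 3' LTR has been replaced by the SSAV 3' LTR.
vector = VectorGenome(uniform_ltr("GaLV SEATO"), uniform_ltr("SSAV"))
provirus = provirus_ltrs(vector)
print(provirus.five_prime_ltr.u3, provirus.three_prime_ltr.u3)
```

In this model the promoter-bearing U3 region at both ends of the
provirus comes from the substituted 3' LTR, which is why the choice
of 3' LTR governs expression in the target cell.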

In addition, in order to increase efficiency of packaging, the 
5' LTR from one virus (e.g., MoMLV) can be used in combination
with the 3' LTR of a second (e.g., GaLV). If the constructs
comprise a MoMLV 5' LTR and a GaLV 3' LTR, they are efficiently
expressed in murine packaging cells (e.g., PG13) but result in
proviral DNA comprising promoter sequences from GaLV which
function more efficiently in human cells. These constructs
are efficiently packaged in packaging cells such as PG13
because the 5' MoMLV LTR drives gene transcription in the
packaging cells. However, when the packaged retroviral vector
infects an appropriate target cell, the 3' GaLV
promoter drives gene transcription (see Example 3, herein).
Examples of retroviral vectors with MoMLV 5' LTRs and
packaging signals and 3' GaLV LTRs include plasmids 521 and
537, described in Example 3, herein. This type of retroviral
vector has the advantages of both efficient packaging in cell 
lines such as PG13 and higher expression in various target 
cells (see Example 4, herein). 

The cis-acting packaging sequences used in the
defective viral genomes may be derived from GaLV SEATO. The
minimal sequences required for efficient packaging of a
GaLV-based defective genome are described herein. In particular,
as shown in detail below, the first 910 to 1290 nucleotides
from the 5' end of the GaLV SEATO genome can direct packaging
of a defective genome by PG13 and PA317 cells. This result
also shows that the sequences required for efficient packaging
from GaLV are recognized by MoMLV core proteins. Thus, hybrid
retroviral vectors comprising both GaLV and MoMLV components
can be conveniently constructed.

The GaLV SEATO sequences required for packaging of 
the defective genomes include the 5' LTR and extend to about
position 1290 of the GaLV genome illustrated in Figure 1. The
sequences required for packaging also include the packaging
site, ψ, which is typically defined negatively as a sequence
which, when deleted from a viral genome, prevents efficient
packaging of the genome. In the GaLV SEATO genome, ψ is
located downstream of the 5' LTR beginning at about position
200. The site usually comprises at least about 350 bp,
preferably between about 500 bp and about 1500 bp, more
preferably about 700 to about 1200 bp. One of skill will
recognize that minor modifications to the packaging sequence
shown in Figure 1 will not substantially affect the ability of
the sequence to direct packaging. Thus, the term "GaLV
packaging site" as used herein refers to GaLV DNA sequences,
or RNA sequences transcribed from them, which are capable of
directing packaging when present in cis in a GaLV genome or
defective genome. The term "GaLV SEATO packaging sites"
refers to those DNA or RNA sequences substantially identical
(as determined above) to the disclosed sequences and which are
functional in the defective GaLV genomes of the present
invention.
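
The coordinates and size ranges given above lend themselves to a
short worked example. The sketch below pulls a region out of a genome
sequence by 1-based inclusive position and classifies its length
against the stated ranges. The function names are hypothetical, the
cutoffs treat the "about" values as exact for simplicity, and the
genome string is an arbitrary placeholder rather than the GaLV SEATO
sequence of Figure 1.

```python
def extract_region(genome: str, start: int, end: int) -> str:
    """Return the 1-based, inclusive region [start, end] of a sequence."""
    if not 1 <= start <= end <= len(genome):
        raise ValueError("coordinates fall outside the sequence")
    return genome[start - 1:end]


def packaging_site_length_class(length_bp: int) -> str:
    """Classify a candidate packaging site length against the stated ranges."""
    if 700 <= length_bp <= 1200:
        return "more preferred"      # about 700 to about 1200 bp
    if 500 <= length_bp <= 1500:
        return "preferred"           # about 500 to about 1500 bp
    if length_bp >= 350:
        return "usual"               # at least about 350 bp
    return "shorter than usual"


# Placeholder 1600-base sequence standing in for a genome (NOT GaLV SEATO).
placeholder_genome = "ACGT" * 400

site = extract_region(placeholder_genome, 200, 1290)
print(len(site), packaging_site_length_class(len(site)))
```

A region running from position 200 through position 1290 spans
1290 - 200 + 1 = 1091 positions, which falls within the about 700 to
about 1200 bp range.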

The retroviral vectors of the invention are suitable
for delivering a variety of polynucleotides to cells,
including transgenes for augmenting or replacing endogenous
genes in gene therapy or for the production of transgenic
animals. Antisense polynucleotides can be used to control
expression of target endogenous genes such as oncogenes. In
addition, genes encoding toxins can be targeted for delivery
to cancer cells. Other suitable sequences include those
encoding growth substances to promote immune responses to
cancers or infections, soluble factors to modulate receptor
activity, and the like. The inserted polynucleotide of
interest should be less than about 10 kb, preferably between
about 7 and 8 kb.

In certain embodiments, homologous targeting
constructs are used to replace an endogenous target gene.
Methods and materials for preparing such constructs are known
by those of skill in the art and are described in various
references. See, e.g., Thomas et al., Cell 51:503 (1987) and
Capecchi, Science 244:1288 (1989), which are incorporated
herein by reference.

Homologous targeting constructs have at least one
region having a sequence that substantially corresponds to, or
is substantially complementary to, a predetermined endogenous
target gene sequence (e.g., an exon sequence, an enhancer, a
promoter, an intronic sequence, or a flanking sequence of the
target gene). Such a homology region serves as a template for
homologous pairing and recombination with substantially
identical endogenous gene sequence(s). In the targeting of
transgenes, such homology regions typically flank the
replacement region, which is a region of the targeting
transgene that is to undergo replacement with the targeted
endogenous gene sequence. Thus, a segment of the targeting
transgene flanked by homology regions can replace a segment of
the endogenous gene sequence by double crossover homologous
recombination.

In addition, the constructs for both homologous
targeting and random integration will comprise a selectable
marker gene to allow selection of cells. Frequently, multiple
selectable marker genes are incorporated, such as in
positive-negative selection constructs for homologous gene targeting.

A selectable marker gene expression cassette
typically comprises a promoter which is operational in the
targeted host cell linked to a structural sequence that
encodes a protein that confers a selectable phenotype on the
targeted host cell, and a polyadenylation signal. A promoter
included in an expression cassette may be constitutive, cell
type-specific, stage-specific, and/or modulatable (e.g., by
hormones such as glucocorticoids; MMTV promoter), but is
expressed prior to and/or during selection.

When the selectable marker is contained in a
homologous targeting construct, homologous recombination at
the targeted endogenous site(s) can be chosen to place the
selectable marker structural sequence downstream of a
functional endogenous promoter, and it may be possible for the
targeting construct replacement region to comprise only a
structural sequence encoding the selectable marker, and rely
upon an endogenous promoter to drive transcription.
Similarly, an endogenous enhancer located near a targeted
endogenous site may be relied on to enhance transcription of
selectable marker gene sequences in enhancerless constructs.

Suitable selectable marker genes include, for
example: gpt (encoding xanthine-guanine
phosphoribosyltransferase), which can be selected for with
mycophenolic acid; neo (encoding neomycin phosphotransferase),
which can be selected for with G418; and DHFR (encoding
dihydrofolate reductase), which can be selected for with
methotrexate. Other suitable selectable markers will be
apparent to those in the art.
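The marker and selection-agent pairings listed above can be kept in a small lookup table when planning constructs. This is a minimal sketch that records only the three pairings named in the text; the dictionary and function names are hypothetical.

```python
# Marker gene -> (encoded enzyme, selection agent), as listed in the text.
SELECTABLE_MARKERS = {
    "gpt": ("xanthine-guanine phosphoribosyltransferase", "mycophenolic acid"),
    "neo": ("neomycin phosphotransferase", "G418"),
    "DHFR": ("dihydrofolate reductase", "methotrexate"),
}


def selection_agent(marker):
    """Return the selection agent paired with a marker gene in the table."""
    enzyme, agent = SELECTABLE_MARKERS[marker]
    return agent


for marker in SELECTABLE_MARKERS:
    print(marker, "->", selection_agent(marker))
```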

Selection for correctly targeted recombinant cells
will generally employ at least positive selection, wherein a
selectable marker gene expression cassette encodes and
expresses a functional protein (e.g., neo or gpt) that confers
a selectable phenotype to targeted cells harboring the
endogenously integrated expression cassette, so that, by
addition of a selection agent (e.g., G418, puromycin, or
mycophenolic acid) such targeted cells have a growth or
survival advantage over cells which do not have an integrated
expression cassette.

Cells harboring the transgene of interest either
randomly integrated or integrated by homologous recombination
may be further identified using techniques well known in the
art. For instance, the cells can be screened using Southern
blotting or the polymerase chain reaction (PCR). If targeted
integration is being screened, the oligonucleotide probes or
PCR primers should bracket recombination junctions that are
formed upon transgene integration at the desired homologous
site.

Gene Therapy

The retroviral vectors of the invention are
particularly suitable for delivering polynucleotides to cells
for gene therapy of a number of diseases. Current strategies
for gene therapy are reviewed in Friedmann, Science 244:1275
(1989), which is incorporated herein by reference.

Delivery of the polynucleotide of interest may be
accomplished in vivo by administration of the vectors to an
individual patient, typically by systemic administration
(e.g., intravenous, intraperitoneal, intramuscular, subdermal,
or intracranial infusion). Alternatively, the vectors may be
used to deliver polynucleotides to cells ex vivo such as cells
explanted from an individual patient (e.g., tumor-infiltrating
lymphocytes, bone marrow aspirates, tissue biopsy) or
universal donor hematopoietic stem cells, followed by
reimplantation of the cells into a patient, usually after
selection for cells which have incorporated the
polynucleotide.

The vectors may be used for gene therapy to treat
congenital genetic diseases, acquired genetic diseases (e.g.,
cancer), viral diseases (e.g., AIDS, mononucleosis,
herpesvirus infection, cytomegalovirus infection,
papillomavirus infection) or to modify the genome of selected
types of cells of a patient for any therapeutic benefit.
Treatable disorders include hemophilia, thalassemias, ADA
deficiency, familial hypercholesterolemia, inherited
emphysema, cystic fibrosis, Duchenne's muscular dystrophy,
lysosomal storage diseases, Gaucher's disease, and chronic
granulomatous disease.

The vectors of the invention can be used to
introduce polynucleotides into a variety of cells and tissues
including myeloid cells, bone marrow cells, lymphocytes,
hepatocytes, fibroblasts, lung cells, and muscle cells. For
example, polynucleotides conferring resistance to a
chemotherapeutic agent may be transferred to non-neoplastic
cells, especially hematopoietic cells. Alternatively,
polynucleotides comprising a toxin gene (e.g., ricin or
diphtheria toxin) expression cassette or a negative selectable
marker gene expression cassette may be selectively inserted
into neoplastic cells. Expression of the toxin gene or
negative selection gene (followed by negative selection)
selectively kills target cells. Polynucleotides which are not
cytotoxic but which reverse or suppress the neoplastic
phenotype (e.g., antisense inhibition of oncogene expression)
may also be used to treat cancer. Other uses include
the introduction of immunomodifiers into bone marrow cells to
treat cancers.

Transgenic Animals 

As noted above, the vectors of the present invention
are particularly useful for gene targeting mediated by
homologous recombination between a targeting polynucleotide
construct and a homologous chromosomal sequence. In addition
to gene therapy, such strategies are also useful for the
production of transgenic animals.

The ability to introduce new genes into the germ
line of an animal has been extremely valuable for basic
understanding of gene expression. The improvement of desired
traits in agricultural or domesticated animals is also
possible using these techniques. For example, potential new
traits that may be introduced include sterility in meat
producing strains of cattle, or fertility and milk production
in dairy cows. Other commercially desirable properties
include hardiness and rapid weight gain in livestock, or "show
qualities" in domestic animals such as dogs and cats. For a
review of the genetic engineering of livestock see Pursel et
al., Science 244:1281 (1989), which is incorporated herein by
reference.

Typically, embryonic stem (ES) cells are used as the
transgene recipients. Cells containing the newly engineered
gene are injected into a host blastocyst, which is reimplanted
into a recipient female. Some of these embryos develop into
chimeric animals that possess germ cells partially derived
from the mutant cell line. By breeding the chimeric animals
it is possible to obtain a new line containing the introduced
gene.

The following examples are provided by way of
illustration, not limitation.

Example 1

Construction of GaLV infectious clone comprising GaLV SEATO
and GaLV SF sequences.

To prepare the GaLV infectious clone, a missing
fragment of about 250 bp from the pol gene of a GaLV SEATO
clone was replaced with the corresponding sequence from GaLV
SF. The following steps correspond to the numbered steps
illustrated in Figures 2A-2F.

The steps illustrated in Figure 2A show repair of
the pol gene of GaLV-SEATO.

1. The approximately 8.5 kb permuted GaLV-SEATO
   provirus (pGAS-2 Hdl) from pGAS-2 (Gelman et
   al., 1982, supra) was isolated by HindIII
   digestion and DEAE-cellulose membrane
   interception in an agarose gel. An
   approximately 250 bp GaLV-SF pol gene fragment
   of pGV-3 corresponding to the missing pol
   fragment of pGAS-2 was isolated by HindIII
   digestion and DEAE-cellulose membrane
   interception in an agarose gel.

2. The two DNA species were ligated at low
   concentration to favor circularization over
   multimer formation.

3. After ligated material was precipitated, Sal I
   restriction was used to linearize the
   construct.

4. The construct was ligated into Sal I-restricted
   and phosphatased pVZ-1 vector.

5. DH5αF' cells were transformed.

6. Transformants were screened by alkaline lysis,
   plasmid mini-preps, and sequencing with "GVGAS
   10" primer to check number and orientation of
   GaLV-SF pol fragment inserts within GaLV-SEATO
   sequence. A clone with correct construction
   was named intermediate Clone 66.

Figure 2B shows change of GaLV-SEATO insert
orientation.

7. The permuted proviral Clone 66 insert was
   isolated by Sal I digestion and DEAE-cellulose
   membrane interception on an agarose gel.

8. The insert was re-ligated back into pVZ-1 Sal
   I-cut and phosphatased vector to obtain the
   opposite orientation. The opposite orientation
   clone was named intermediate Clone 120.

Figures 2C and 2D illustrate the intermediate Clone
66 and the unidirectional decrease in insert length using
Exonucleases III and VII.

9. Intra-insert distances were estimated by known
   sequence and accurate restriction mapping. The
   goal was to decrease the 8.5 kb insert by 5.4
   kb, stopping at a point just 3' of the LTR-LTR
   junction, leaving one LTR intact. The size of
   the resulting clone (vector + insert) was ~6 kb.

10. Not I restriction of Clone 66 and Clone 120 was
    used to check for absence of intra-insert
    sites. They were found to be absent. Clone 66
    was linearized with Not I in the multiple
    cloning site.

11. The Not I termini were filled in with cold
    dCTP[αS] and dGTP[αS] and DNA polymerase I
    (Klenow). α-Thiodeoxyribonucleotides were used
    to block these termini from Exonuclease III
    digestion.

12. Clone 66 and Clone 120 were restricted with Xba
    I to check for absence of intra-insert sites.
    Clone 66 was restricted with Xba I in the
    multiple cloning site, generating 5' overhang
    cohesive termini.

13. Precisely timed Exonuclease III digestion
    destroyed the Xba I site, but the Sal I site at
    the 5' insert end was left intact, and the
    incomplete Not I site was resistant to attack by
    Exonuclease III.

14. Digestion with Exonuclease VII was used to
    remove remaining single strand.

15. The "ragged ends" were filled in with DNA
    polymerase (Klenow) and cold deoxynucleotide
    triphosphates.

16. The blunt ended incomplete Not I site was
    ligated to insert sequence.

17. DH5αF' cells were transformed.

18. Transformants were screened by alkaline lysis,
    plasmid mini-preps, Sal I linearization and
    sequencing to determine (a) extent of insert
    deletion and (b) quality of incomplete Not I
    sites and the true extent of protection given
    by α-thiodeoxyribonucleotides from digestion
    into the vector by Exonuclease III or VII.

19. Transformants were further screened by Not I
    digestion, searching for complete Not I site.

20. Clones that linearize with Not I were
    linearized to confirm presence of complete Not
    I site and accurately determine extent of
    insert deletion. One clone with the desired
    digestion to a point just 3' of the LTR-LTR
    junction, and with a complete Not I site, was
    named intermediate Clone 66Exo52.
Figure 2E shows the intermediate Clone 120:
unidirectional decrease in insert length using Exonucleases
III and VII.

21. Intra-insert distance was estimated by known
    sequence and accurate restriction mapping. The
    goal was to decrease the 8.5 kb insert by 2.6
    kb, stopping at a point just 3' of the LTR-LTR
    junction, leaving one LTR intact. The size of
    the resulting clone was ~9 kb.

22 to 32. The steps were performed as described for steps
    10-20. One clone with the desired digestion to a
    point just 3' of the LTR-LTR junction, and with
    a complete Not I site, was named intermediate
    Clone 120Exo55.

Figure 2F shows coupling of the Clone 66Exo52 insert and
the Clone 120Exo55 insert: separation of LTRs and generation of
infectious clone.

33. Double digestion of both Clone 66Exo52 and
    Clone 120Exo55 with Sal I and Not I was used to
    release inserts.

34. Inserts were isolated by DEAE-cellulose
    membrane interception in agarose gels.

35. Ligation of Clone 66Exo52 insert, Clone
    120Exo55 insert and Not I restricted
    pVZ-vector.

36. DH5αF' cells were transformed.

37. Screening of transformants by 32P-labelled
    probing of colonies, alkaline lysis plasmid
    mini-preps, restriction analysis and sequencing
    to search for potential infectious clones with
    correct construction.

38. Large scale plasmid preparation and restriction
    mapping of GaLV-SEATO infectious clone.
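The size targets quoted for the two exonuclease deletions (steps 9 and 21) can be cross-checked with simple arithmetic. The sketch below uses only the approximate figures given in those steps; the "implied vector" value is an inference from those figures, not a size stated in the text, and all names are hypothetical.

```python
# Approximate sizes in bp, taken from steps 9 and 21.
INSERT_BP = 8500                      # permuted proviral insert
DELETIONS = {
    "Clone 66": {"deleted": 5400, "clone_total": 6000},
    "Clone 120": {"deleted": 2600, "clone_total": 9000},
}


def remaining_insert(insert_bp, deleted_bp):
    """Insert length left after the unidirectional deletion."""
    return insert_bp - deleted_bp


def implied_vector(clone_total_bp, insert_left_bp):
    """Vector size implied by the quoted clone size (an inference only)."""
    return clone_total_bp - insert_left_bp


for name, d in DELETIONS.items():
    left = remaining_insert(INSERT_BP, d["deleted"])
    print(name, "insert left:", left, "implied vector:", implied_vector(d["clone_total"], left))
```

Both clones imply a vector backbone of roughly 3 kb, so the approximate clone sizes quoted in the two steps are mutually consistent.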



The resulting cloned GaLV genome was subsequently
shown to encode infectious GaLV virions.

Example 2

Construction of defective genomes comprising GaLV SF and GaLV
SEATO packaging sites.

The steps used to prepare a defective genome
comprising GaLV SEATO sequences from the infectious clone in
Example 1 were as follows.

1. A 1667 bp Not I-Bgl II fragment from the 5' end of the
infectious clone of GaLV SEATO was isolated.

2. A 3116 bp Bam HI-Xba I fragment corresponding to the
Lac Z gene was isolated from the pl203 Lac Z plasmid
(Ghattas et al., supra).

3. A 596 bp Xba I to Hind III fragment corresponding to
the EMCV IRES (encephalomyocarditis virus internal ribosome
entry site) was isolated from pLZIC2 (Ghattas et al., supra).

4. An 890 bp Stu I-Sfu I fragment corresponding to the
G418 resistance gene was isolated from pRcCMV plasmid
(Invitrogen).

5. A 995 bp Stu I-Not I fragment corresponding to the 3'
end of the GaLV SEATO infectious clone was isolated.

6. A linearized Not I pGem 13 plasmid (Promega, 3181 bp)
was isolated.

7. These fragments were ligated together to assemble the
pGaLV SEATO 395 plasmid.
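As a rough check on the assembly, the nominal sizes of the listed fragments can be summed. This is only a sketch under a simplifying assumption: it adds the fragment lengths as quoted in steps 1 to 6 and ignores any bases gained or lost when ends are filled in or joined. The variable names are hypothetical.

```python
# Nominal fragment lengths (bp) as quoted in steps 1-6.
FRAGMENTS_BP = [
    ("GaLV SEATO 5' end, Not I-Bgl II", 1667),
    ("Lac Z, Bam HI-Xba I", 3116),
    ("EMCV IRES, Xba I-Hind III", 596),
    ("G418 resistance gene, Stu I-Sfu I", 890),
    ("GaLV SEATO 3' end, Stu I-Not I", 995),
    ("pGem 13 vector, Not I-linearized", 3181),
]

VECTOR_LABEL = "pGem 13 vector, Not I-linearized"

# Total nominal plasmid size and the size of the defective genome alone.
total_bp = sum(length for _, length in FRAGMENTS_BP)
genome_bp = sum(length for label, length in FRAGMENTS_BP if label != VECTOR_LABEL)

print("defective genome (nominal):", genome_bp, "bp")   # 7264 bp
print("assembled plasmid (nominal):", total_bp, "bp")   # 10445 bp
```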

Figure 3 (top) shows the resulting defective genome.
Figure 3 (middle) shows a defective genome constructed in the
same manner but using a Not I-Nco I fragment from the 5' end
of the GaLV SEATO genome. Figure 3 (bottom) shows a construct
prepared from GaLV SF sequences.

The pGaLV SEATO 395 plasmid was further modified by
increasing the length of the 5' putative packaging region by
328 bp in creating the GaLV SEATO 558 construct. Plasmid 558
thus represents a modified 395 plasmid which contains an
additional 328 nucleotides of 5' GaLV SEATO sequences
extending to the Bgl II site at position 1290 of the GaLV
genome. (Plasmid 395 extends only to the Nco I site at
position 910 of the GaLV genome.) The 558 plasmid
construction was made using the 194 GaLV SF plasmid. The GaLV
SF 194 plasmid contains a truncated GaLV SF genome cloned into
the Promega pSP72 plasmid at the Eco RI site.

The steps in construction of the 558 plasmid are
listed below.

1. A Pst I-Bgl II fragment of GaLV SEATO containing the
5' GaLV SEATO LTR and the GaLV SEATO packaging site was
used to replace the corresponding region of the GaLV SF
194 plasmid partial genome.

2. A Bam HI-Xba I fragment containing the bacterial Lac
Z gene but lacking an initiation codon was ligated, in
reading frame, to the Bgl II site such that the Lac Z
gene initiated from the GaLV SEATO gag protein
translation start codon. Therefore the β-galactosidase
protein is a GaLV SEATO gag-Lac Z fusion protein.

3. An Xba I to Nsi I fragment containing the EMCV IRES
and a G418 gene was ligated to the Xba I site downstream
of the Lac Z gene and the Nsi I site in the 3' region of the
GaLV SF 194 genome.

4. The Nsi I-Sma I region at the 3' end of the 194 GaLV
SF genome was replaced with a corresponding region of
GaLV SEATO, such that the 3' U3 of the LTR contained GaLV
SEATO derived sequences in place of the GaLV SF 194
sequences.

The schematic diagrams of plasmids 395 and 558 are
compared in Figure 4 and the nucleotide sequence of plasmid
558 is shown in Seq. ID No. 3.

Example 3

Construction of GaLV defective genomes with improved packaging
efficiency in murine packaging cell lines that express MoMLV
structural proteins

In order to improve the efficiency of packaging in
murine packaging cell lines such as PG13 and PA317, which
express MoMLV structural proteins, we constructed GaLV
defective genomes that have a MoMLV promoter at the 5' end and
a GaLV promoter at the 3' end.

Two defective genomes, designated plasmid 521 and
plasmid 537, having a MoMLV promoter at the 5' end and a GaLV
promoter at the 3' end, were constructed. In order to
construct plasmid 521, the 5' end of the 395 plasmid (Sfi
I/filled in-Cla I) was replaced with the corresponding
fragment of a similar MoMLV-based Lac Z genome (Sst II/filled
in to Cla I). In order to construct plasmid 537, the 3' Nsi
I-Not I (filled in) fragment of 521 was replaced with the Nsi I-Bgl
II (filled in) fragment of GaLV SF 194.

For comparative purposes, a MoMLV defective genome
plasmid, similar in construction to the 521 plasmid, was
prepared by replacing the Spe I-Sph I fragment of pLXSN
(which contains the end of the MoMLV packaging region, the
SV40 promoter and the 5' part of the G418 gene) with the
corresponding region (also an Spe I-Sph I fragment) of the 521
genome, thereby replacing the SV40 promoter with an IRES
element. This defective genome is designated plasmid 560.
Plasmids 521, 537, and 560 are shown schematically in Figures
4 and 5. The nucleotide sequence of plasmid 521 is shown in
Seq. ID No. 4 and the nucleotide sequence of plasmid 537 is
shown in Seq. ID No. 5.

The 521 and 537 plasmid constructs provide a means
of optimizing gene expression in the packaging cells while
retaining GaLV-driven gene expression in target cells where
GaLV promoters function more efficiently than the MoMLV
promoter. Because the 521 and 537 constructs have a MoMLV
promoter at the 5' end, cells transfected with these
constructs (such as packaging cells PA317 and PG13) have a
MoMLV promoter (U3) driving gene transcription. On the other
hand, when the genome is reverse transcribed after infection
of the target cell, the GaLV U3 promoter in the 3' LTR is
duplicated and replaces the MoMLV promoter at the 5' end.
This has been demonstrated by sequence analysis of
unintegrated vector DNA from 521 target cells (data not
shown). The DNA from these cells infected with the 521
construct after packaging in either PG13 or PA317 cells
contains a 5' and 3' GaLV SEATO U3 (data not shown).
Therefore the 5' end of the 521 genome switches from a MoMLV
U3 to a GaLV SEATO U3 in infected cells, which results in
GaLV-driven gene expression in target cells.
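The promoter switch described above follows from the general way retroviral LTRs are regenerated: the genomic RNA carries U3 only at its 3' end, and reverse transcription copies that U3 into both LTRs of the new provirus. The following is a deliberately simplified toy model of that bookkeeping, not a description of the actual constructs; the class names, the placeholder labels for R, U5 and the internal region, and the dictionary layout are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LTR:
    u3: str   # promoter/enhancer region
    r: str
    u5: str


@dataclass(frozen=True)
class Provirus:
    ltr5: LTR
    body: str
    ltr3: LTR


def transcribe(dna):
    """Genomic RNA runs from R of the 5' LTR to R of the 3' LTR,
    so it keeps U5 from the 5' LTR and U3 from the 3' LTR."""
    return {"r": dna.ltr5.r, "u5": dna.ltr5.u5, "body": dna.body, "u3": dna.ltr3.u3}


def reverse_transcribe(rna):
    """Both LTRs of the new provirus are rebuilt as U3-R-U5."""
    ltr = LTR(u3=rna["u3"], r=rna["r"], u5=rna["u5"])
    return Provirus(ltr5=ltr, body=rna["body"], ltr3=ltr)


# Toy stand-in for plasmid 521: MoMLV U3 at the 5' end, GaLV SEATO U3 at the 3' end.
# "R", "U5" and the body label are placeholders, not sequence data from the text.
plasmid_521 = Provirus(
    ltr5=LTR(u3="MoMLV", r="R", u5="U5"),
    body="lacZ-IRES-neo",
    ltr3=LTR(u3="GaLV SEATO", r="R", u5="U5"),
)

in_target_cell = reverse_transcribe(transcribe(plasmid_521))
print(in_target_cell.ltr5.u3, "|", in_target_cell.ltr3.u3)
```

In this model the packaging cell transcribes from the 5' MoMLV U3, while the provirus formed in the target cell carries the GaLV SEATO U3 in both LTRs, matching the behavior reported for the 521 construct.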

Example 4

Effect of the number of 45 bp enhancer elements in the U3
region of the GaLV LTR on efficiency of gene expression in
target cells

There are a variable number of repetitive 45 bp
enhancer elements in the U3 region of the GaLV LTR. The 558
plasmid and the 521 plasmid U3 regions, derived from GaLV
SEATO, each contain 3 repetitive 45 bp enhancer elements,
whereas GaLV SF (e.g., plasmids 537 and 559) has only one of
these elements. The number of repeats may play a restrictive
but potentially useful role in governing expression of
downstream genes in different target cells. The experimental
data presented below suggest that the number of repetitive 45
bp enhancer elements in the U3 region of the LTR of GaLV can
affect the efficiency of tissue/cell specific gene expression.
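Counting tandem copies of a repeat unit in a U3 sequence is straightforward to script. The sketch below is illustrative only: the 45-nucleotide `TOY_UNIT` is an arbitrary placeholder and is not the GaLV enhancer element, and the function name and flanking sequences are hypothetical.

```python
def count_tandem_repeats(seq, unit):
    """Return the largest number of back-to-back copies of `unit` found in `seq`."""
    best = 0
    start = seq.find(unit)
    while start != -1:
        copies = 0
        pos = start
        # Walk forward one unit at a time while copies continue.
        while seq.startswith(unit, pos):
            copies += 1
            pos += len(unit)
        best = max(best, copies)
        start = seq.find(unit, start + 1)
    return best


# Arbitrary 45-nt placeholder unit (NOT the real enhancer sequence).
TOY_UNIT = "GATTACA" * 6 + "CGC"
assert len(TOY_UNIT) == 45

three_copy_u3 = "TTTT" + TOY_UNIT * 3 + "AAAA"   # stands in for a SEATO-like U3
one_copy_u3 = "TTTT" + TOY_UNIT + "AAAA"         # stands in for an SF-like U3

print(count_tandem_repeats(three_copy_u3, TOY_UNIT))
print(count_tandem_repeats(one_copy_u3, TOY_UNIT))
```

An exact-match count like this would miss repeats that differ by point mutations; a tolerant comparison would be needed for real sequence data.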

Following transfection of the 521, 537 or 560
plasmids into the PA317 or PG13 cell lines, the MoMLV
promoters are used to express packageable genomes. For the 521
and 537 plasmids, however, the GaLV promoter is used to
express β-galactosidase and G418 resistance in the target cell
after infection with the packaged defective genomes. The
effect of three repeats of the 45 base pair enhancer region
versus only one copy of the enhancer region in the GaLV
promoter is shown in the table below. The expression of the
G418 indicator gene is measured by titering G418 resistant
colonies. The data in the table below demonstrate the effect
of varying the number of 45 bp enhancer region repeats on the
expression of genes driven by the GaLV LTRs in different cell
types (see table).

Table: Efficiency of Gene Expression Directed by Retroviral
Vectors in Various Target Cells



genome 

packaging cells: 

promoter used 
target cells: 

mink fibroblasts 
murine NIH 3T3 cells 
BHK hamster cells 
HaK hamster cells 
Bovine MDBK cells 



537 
PG13
GaLV SF 



2X10 
5x10^ 



5X10" 



2# 



558 
PG13

GaLV SEATO 

5X10 4 
5.0 

0.5X10 
0.5X10 2 
5X10 4 
-* 



560 

PG13

MoMLV 

5.0 
5X10 4 



5x10' 
— * 



Human KB cells
Human HeLa cells
Human 293 cells



5x10* 
5xl0 4 
5x10 
5X10 



3* 



5x10^ 
5X10 2 
5X10 2 
5X10* 



5X10 
5X10 2 
5x10^ 
5xl0 : 



# titer expressed as number of G418 resistant colonies
obtained with 1 ml of PG13 or PA317 supernatant
containing retroviral vectors with either the 537, 558 or
560 genomes
* genomes packaged in PA317 cells



Although the present invention has been described in
some detail by way of illustration and example for purposes of
clarity and understanding, it will be apparent that certain
changes and modifications may be practiced within the scope of
the appended claims.

SEQUENCE LISTING



(1) GENERAL INFORMATION:

  (i) APPLICANT:
      (A) NAME: The United States of America, as represented by
          The Secretary of the Department of Health and Human Services
      (B) STREET: 6011 Executive Blvd., Suite 325
      (C) CITY: Rockville
      (D) STATE: Maryland
      (E) COUNTRY: U.S.A.
      (F) POSTAL CODE (ZIP): 20852
      (G) TELEPHONE: (301) 496-7056
      (H) TELEFAX: (301) 402-0220
      (I) TELEX:

  (ii) TITLE OF INVENTION: Gibbon Ape Leukemia Virus-Based
       Retroviral Vectors

  (iii) NUMBER OF SEQUENCES: 5

  (iv) COMPUTER READABLE FORM:
      (A) MEDIUM TYPE: Floppy disk
      (B) COMPUTER: IBM PC compatible
      (C) OPERATING SYSTEM: PC-DOS/MS-DOS
      (D) SOFTWARE: PatentIn Release #1.0, Version #1.25

  (v) CURRENT APPLICATION DATA:
      (A) APPLICATION NUMBER: WO not yet assigned
      (B) FILING DATE: 06-APR-1994
      (C) CLASSIFICATION:

  (vii) ATTORNEY/AGENT INFORMATION:
      (A) NAME: Bastian, Kevin L.
      (B) REGISTRATION NUMBER: 34,774
      (C) REFERENCE/DOCKET NUMBER: 15280-128- IPC

  (ix) TELECOMMUNICATION INFORMATION:
      (A) TELEPHONE: (415) 543-9600
      (B) TELEFAX: (415) 543-5043

(2) INFORMATION FOR SEQ ID NO:1:

  (i) SEQUENCE CHARACTERISTICS:
      (A) LENGTH: 8535 base pairs
      (B) TYPE: nucleic acid
      (C) STRANDEDNESS: single
      (D) TOPOLOGY: linear

  (ii) MOLECULE TYPE: DNA (genomic)

  (iii) HYPOTHETICAL: NO

  (ix) FEATURE:
      (A) NAME/KEY: misc_feature
      (B) LOCATION: 1..8535
      (D) OTHER INFORMATION: /standard_name= "GaLV SEATO Genome"

  (xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:



AATGAAAGAA GTGTXTTTTT TTAGCCAACT GCAGTAACGC CATTTTGCTA GGCACACCTA  60
AAGGATAGGA AAAATACAGC TAAGAACAGG GCCAAACAGG ATATCTGTGG TCATGCACCT  120
GGGCCCCGGC CCAGGCCAAG GACAGAGGGT TCCCAGAAAT AGATGAGTCA ACAGCAGTTT  180
CCAGCAAGGA CAGAGGGTTC CCAGAAATAG ATGAGTCAAC AGCAGTTTCC AGGGTGCCCC  240
TCAACCGTTT CAAGGACTCC CATGACCGGG AATTCACCCC TGGCCTTATT TGAACTAACC  300
AATTACCTTG CCTCTCGCTT CTGTACCCGC GCTTTTTGCT ATAAAAATAA GCTCAGAAAC  360
TCCACCCGGA GCGCCAGTCC TTAGAGAGAC TGAGCCGCCC GGGTACCCGT GTGTCCAATA  420
AAACCTCTTG CTGATTGCAT CCGGAGCCGT GGTCTCGTTG TTCCTTGGGA GGGTTTCTCC  480
TAACTATTGA CCGCCCACTT CGGGGGTCTC ACATTTGGGG GCTCGTCCGG GATCGGAAAC  540
CCCACCCAGG GACCACCGAC CCACCAACGG GAGGTAAGCT GGCCAGCGAC CGTTGTGTGT  600
CTCGCTTCTG TGTCTAAGTC CGTAATTCTG ACTGTCCTTG TGTGTCTCGC T1CTGTGTCT  660
GAGACCGTAA CTCTGACTGC CCTTGTAAGT GCGCGCATTT TTTTGGTTTC AGTCTGTTCC  720
GGGTGAATCA CTCrGCGAGT GACGTGTGAG TAGCGAACAG ACGTGTTCGG GGCTCACCGC  780
CTGGTAATCC AGGGAGACGT CCCAGGATCA GGGGAGGACC AGGGACGCCT GGTGGACCCC  840
TCGGTAACGG GTCGTTGTGA CCCGATTTCA TCGCCCGTCT GGTAAGACGC GCTCTGAATC  900
TGATTCTCTC TCTCGGTCGC CTCGCCGCCG TCTCTGGTTT CTTTTTGTTT CGTTTCTGGA  960
AAGCCTCTGT GTCACAGTCT TTCTCTCCCA AATCATCAAT ATGGGACAAG ATAATTCTAC  1020
CCCTATCTCC CTCACTCTAA ATCACTGGAG AGATGTGAGA ACAAGGGCTC ACAATCTATC  1080
CGTGGAAATC AAAAAGGGAA AATGGCAGAC TTTCTGTTCC TCCGAGTGGC CCACATTCGG  1140
CGTGGGGTGG CCACCGGAGG GAACTTTTAA TCTCTCTGTC ATTTTTGCAG TTAAAAAGAT  1200
TGTCTTTCAG GAGAACGGGG GACATCCGGA CCAAGTTCCA TATATCGTGG TATGGCAGGA  1260
CCTCGCCCAG AATCCCCCAC CATGGGTGCC AGCCTCCGCC AAGGTCGCTG TTGTCTCTGA  1320
TACCCGAAGA CCAGTTGCGG GGAGGCCATC AGCTCCTCCC CGACCCCCCA TCTACCCGGC  1380
AACAGACGAC TTACTCCTCC TCTCTGAACC CACGCCCCCG CCCTATCCGG CGGCACTGCC  1440
ACCCCCTCTG GCCCCTCAGG CGATCGGACC GCCGTCAGGC CAGATGCCCG ATAGTAGCGA  1500
TCCTGAGGGG CCAGCCGCTG GGACCAGGAG TCGCCGTGCC CGCAGTCCAG CAGACAACTC
GGGTCCTGAC TCCACTGTGA TTTTGCCCCT CCGAGCCATA GGACCCCCGG CCGAGCCCAA
TGGCCTGGTC CCTCTACAAT ATTGGCCTTT TTCCTCAGCA GATCTTTATA ATTGGAAATC
TAATCATCCC TCTTTTTCTG AAAACCCAGC AGGTCTCACG GGGCTCCTTG AGTCTCTTAT 
GTTCTCCCAT CAGCCCACTT GGGACGATTG CCAACAGCTC CTACAGATTC TTTTCACCAC 
TGAGGAACGG GAAAGAATTC TCCTGGAGGC CCGCAAAAAT GTCCTTGGGG ACAATGGGGC 
CCCTACACAG CTCGAGAACC TCATTAATGA GGCCTTCCCC CTCAATCGAC CTCACTGGGA 
TTACAACACA GCCGCAGGTA GGGAGCGTCT TCTGGTCTAC CGCCGGACTC TAGTGGCAGG 
TCTCAAAGGG GCAGCTCGGC GTCCTACCAA TTTGGCTAAG GTAAGAGAGG TCTTGCAGGG 
ACCGGCAGAA CCCCCTTCGG TTTTCTTAGA ACGCCTGATG GAGGCCTATA GGAGATACAC 
TCCGTTTGAT CCCTCTTCTG AGGGACAACA GGCTGCGGTC GCCATGGCCT TTATCGGACA 
GTCAGCCCCA GATATCAAGA AAAAGTTACA GAGGCTAGAG GGGCTCCAGG ACTATTCCTT 
ACAAGATTTA GTAAAAGAGG CAGAAAAGGT GTACCATAAG AGAGAGACAG AAGAAGAAAG 
ACAAGAAAGA GAAAAAAAGG AGGCAGAAGA AAAGGAGAGG CGGCGCGATA GGCCGAAGAA 
AAAAAACTTG ACTAAAATTC TGGCCGCAGT AGTAAGTAGA GAAGGGTCCA CAGGTAGGCA 
GACAGGGAAC CTGAGCAACC AGGCAAAGAA GACACCTAGG GATGGAAGAC CTCCACTAGA 
CAAAGACCAG TGCGCATACT GTAAAGAGAA GGGCCATTGG GCAAGAGAAT GTCCCCGAAA 
AAAACACGTC AGAGAAGCCA AGGTTCTAGC CCTAGATAAC TAGGGGAGTC AGGGTTCGGA 
CCCCCTCCCC GAACCTAGGG TAACACTGAC TGTGGAGGGG ACCCCCATTG AGTTCCTGGT 
CGACACCGGA GCTGAACATT CAGTATTGAC CCAACCCATG GGAAAAGTAG GGTCCAGACG 
GACGGTCGTG GAAGGAGCGA CAGGCAGCAA GGTCTACCCC TGGACCACAA AAAGACTTTT 
AAAAATTGGA CATAAACAAG TGACCCACTC CTTCCTGGTC ATACCCGAGT GCCCTGCTCC 
TCTGTTGGGC AGGGACCTCC TAACCAAACT AAAGGCCCAG ATCCAGTTTT CCGCTGKGGG 
CCCACAGGTA ACATGGGGAG AACGCCCTAC TATGTGCCTG GTCCTAAACC TGGAAGAAGA 
ATACCGACTA CATGAAAAGC CAGTACCCTC CTCTATCGAC CCATCCTGGC TCCAGCTTTT 
CCCCACTGTA TGGGCAGAAA GAGCCGGCAT GGGACTAGCC AATCAAGTCC CACCAGTGGT 
AGTAGAGCTA AGATCAGGTG CCTCACCAGT GGCTGTTCGA CAATATCCAA TGAGCAAAGA 
AGCTCGGGAA GGTATCAGAC CCCACATCCA GAAGTTCCTA GACCTAGGGG TCTTGGTGCC 
CTGTCGGTCG CCCTGGAATA CCCCTCTGCT ACCTGTAAAA AAGCCAGGGA CCAATGACTA 
TCGGCCAGTT CAAGACCTGA GAGAAATTAA TAAAAGGGTA CAGGATATTC ATCCCACAGT 
CCCAAACCCT TACAATCTTC TGAGTTCCCT TCCGCCTAGC TATACTTGGT ACTCAGTCTT 
AGATCTCAAG GATGCCTTTT TCTGCCTCAG GCTACATCCC AACAGCCAGC CGCTGTTCGC 
GTTCGAGTGG AAAGACCCAG AAAAAGGTAA CACAGGTCAG CTGACCTGGA CGCGGCTACC 
ACAAGGGTTC AAGAACTCTC CCACTCTCTT CGACGAGGCC CTCCACCGAG ATTTGGCTCC 
CTTTAGGGCC CTCAACCCCC AGGTGGTGTT ACTCCAATAT GTGGACGACC TCTTGGTGGC 
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CGCCCCCACA 


TATGAAGACT 


GCAAAAAAGG 


AACACAGAAG 


CTCTTACAGG 


AGTTAAGTAA 


3660 


GTTGGGGTAC 


CGGGTATCGG 


CTAAGAAGGC 


CCAGCTCTGC 


CAGAGAGAAG 


TCACCTATCT 


3720 


GGGGTACCTA 


CTCAAGGAAG 


GAAAAAGATG 


GCTAACCCCA 


GCCCGAAAGG 


CTACTGTTAT 


3780 


GAAAATCCCT 


GTTCCTACGA 


CCCCCAGACA 


GGTCCGTGAA 


TTTCTAGGCA 


CTGCCGGATT 


3840 


CTGCAGGCTC 


TGGATCCCTG 


GGTTTGCTTC 


CCTGGCTGCA 


CCCTTGTACC 


CCCTAACAAA 


3900 


AGAGAGCATC 


CCTTTTATTT 


GGACTGAGGA 


ACATCAGCAG 


GCTTTTGACC 


ACATAAAAAA 


3960 


AGCCTTGCTG 


TCAGCCCCTG 


CATTGGCCCT 


CCCAGACCTC 


ACCAAGCCAT 


TCACTCTATA 


4020 


TATAGATGAG 


AGAGCCGGCG 


TGG CCCGGGG 


AGTGCTCACT 


CAGACTTTAG 


GACCCTGGCG 


4080 


GCGGCCAGTA 


GCATATCTAT 


CAAAAAAACT 


GGATCCGGTG 


GCCAGCGGGT 


GGCCAACCTG 


4140 


CCTGAAAGCG 


GTTGCAGCAG 


TAGCACTCCT 


TCTCAAAGAC 


GCTGATAAGT 


TAACCTTGGG 


4200 


ACAAAATGTG 


ACTGTGATTG 


CTTCCCATAG 


CCTCGAAAGC 


ATCGTGCGGC 


AACCCCCCGA 


4260 


CCGGTGGATG 


ACCAATGCCA 


GAATGACTCA 


TTACCAGAGC 


CTGCTGTTAA 


ATGAAAGGGT 


4320 


ATCGTTTGCG 


CCCCCTGCTG 


TCCTAAACCC 


AGCTACCCTA 


CTTCCAGTCG 


AGTCGGAAGC 


4380 


CACCCCAGTG 


CACAGGTGCT 


CAGAAATCCT 


CGCCGAAGAA 


ACTGGAACTC 


GACGAGACCT 


4440 


AGAAGACCAA 


CCATTGCCCG 


GGGTGCCAAC 


CTGGTATACA 


GACGGTAGCA 


GTTTCATCAC 


4500 


GGAAGGTAAA 


CGGAGAGCAG 


GGGCCCCGAT 


CGTAGATGGC 


AAGCGGACGG 


TATGGGCTAG 


4560 


CAGCCTGCCA 


GAAGGTACGT 


CAGCCCAGAA 


GGC7GAACTA 


GTAGCCTTGA 


CGCAGGCATT 


4620 


ACGCCTGGCC 


GAAGGAAAAA 


ACATCAACAT 


CTACACGGAC 


AGCAGGTATG 


CTTTTGCGAC 


4680 


TGCTCATATT 


CATGGGGCAA 


TATATAAGCA 


GAGGGGGCTG 


CTCACTTCTG


CTGGAAAAGA 


4740 


TATCAAAAAC 


AAAGAGGAAA 


TTTTGGCCCT 


GCTAGAGGCC 


ATCCATCTCC 


CTAGGCGGGT 


4800 


CGCCATTATC 


CACTGTCCTG 


GCCACCAGAG 


GGGAAGTAAC 


CCTGTGGCCA 


CTGGGAACCG 


4860 


GAGGGCCGAC 


GAGGCTGCAA 


AGCAAGCCGC 


CCTGTCGACC 


AGAGTGCTGG 


GAGGAACTAC 


4920 


AAAACCTCAA 


GAGCCAATCG 


AGCCCGCTCA 


AGAAAAGACC 


AGGCCGAGGG 


AGCTGACCCC 


4980 


TGACCGGGGA 


AAAGAATTCA 


TTAAGCGGTT 


ACATCAGTTA 


ACTCACTTAG 


GACCAGAAAA 


5040 


GCTTCTCCAA 


CTAGTGAACC 


GTACCAGCCT 


CCTCATCCCG 


AACCTCCAAT 


CTGCAGTTCG 


5100 


CGAAGTCACC 


AGTCAGTGTC 


AGGCTTGTGC 


CATGACTAAT 


GCGGTCACCA 


CCTACAGAGA 


5160 


GACCGGAAAA 


AGGCAACGAG 


GAGATCGACC 


CGGCGTGTAC 


TGGGAGGTAG 


ACTTCACAGA 


5220 


AATAAAGCCT 


GGTCGGTATG 


GAAACAAGTA 


TCTGTTAGTA 


TTCATAGATA 


CTTTCTCCGG 


5280 


ATGGGTAGAA 


GCTTTTCCTA 


CCAAAACTGA 


AACGGCCCTA 


ATCGTCTGTA 


AAAAAATATT 


5340 


AGAAGAAATT 


CTACCCCGCT 


TCGGGATCCC 


TAAGGTACTC 


GGGTCAGACA 


ATGGCCCGGC 


5400 


CTTTGTTGCT 


CAGGTAAGTC 


AGGGACTGGC 


CACTCAACTG 


GGGATAAATT 


GGAAGTTACA 


5460 


TTGTGCGTAT 


AGACCCCAGA 


GCTCAGGTCA 


GGTAGAAAGA 


ATGAACAGAA 


GAATinAAGA 


5520 


GACCTTGACC 


AAATTAGCCT 


TAGAGACCGG 


TGGAAAAGAC 


TGGGTGACCC 


TCCTTCCCTT 


5580 


AGCGCTGCTT 


A(yvv wW-rVw-vjrV 


ATAfCCCTGG 


CCGGTTTGGT 


TTAACTCCTT 


ATGAAATTCT 


5640 


CTATGGAGGA 


CCACCCCCCA 


TACTTGAGTC 


TGGAGAAACT 


TTGGGTCCCG 


ATGATAGATT 


5700 
TCTCCCTGTC TTATTTACTC ACTTAAAGGC TTTAGAAATT GTAAGGACCC AAATCTGGGA 5760
CCAGATCAAA GAGGTGTATA AGCCTGGTAC CGTAACAATC CCTCACCCGT TCCAGGTCGG 5820
GGATCAAGTG CTTGTCAGAC GCCATCGACC CAGCAGCCTT GAGCCTCGGT GGAAAGGCCC 5880
ATACCTGGTG TTGCTGACTA CCCCGACCGC GGTAAAAGTC GATGGTATTG CTGCCTGGGT 5940
CCATGCTTCT CACCTCAAAC CTGCACCACC TTCGGCACCA GATGAGTCCT GGGAGCTGGA 6000
AAAGACTGAT CATCCTCTTA AGCTGCGTAT TCGGCGGCGG CGGGACGAGT CTGCAAAATA 6060
AGAACCCCCA CCAGCCCATG ACCCTCACTT GGCAGGTACT GTCCCAAACT GGAGACGTTG 6120
TCTGGGATAC AAAGGCAGTC CAGCCCCCTT GGACTTGGTG GCCCACACTT AAACCTGATG 6180
TATGTGCCTT GGCGGCTAGT CTTGAGTCCT GGGATATCCC GGGAACCGAT GTCTCGTCCT 6240
CTAAACGAGT CAGACCTCCG GACTCAGACT ATACTGCCGC TTATAAGCAA ATCACCTGGG 6300
GAGCCATAGG GTGCAGCTAC CCTCGGGCTA GGACTAGAAT GGCAAGCTCT ACCTTCTACG 6360
TATGTCCCCG GGATGGCCGG ACCCTTTCAG AAGCTAGAAG GTGCGGGGGG CTAGAATCCC 6420
TATACTGTAA AGAATGGGAT TGTGAGACCA CGGGGACCGG TTATTGGCTA TCTAAATCCT 6480
CAAAAGACCT CATAACTGTA AAATGGGACC AAAATAGCGA ATGGACTCAA AAATTTCAAC 6540
AGTGTCACCA GACCGGCTGG TGTAACCCCC TTAAAATAGA TTTCACAGAC AAAGGAAAAT 6600
TATCCAAGGA CTGGATAACG GGAAAAACCT GGGGATTAAG ATTCTATGTG TCTGGACATC 6660
CAGGCGTACA GTTCACCATT CGCTTAAAAA TCACCAACAT GCCAGCTGTG GCAGTAGGTC 6720
CTGACCTCGT CCTTGTGGAA CAAGGACCTC CTAGAACGTC CCTCGCTCTC CCACCTCCTC 6780
TTCCCCCAAG GGAAGCGCCA CCGCCATCTC TCCCCGACTC TAACTCCACA GCCCTGGCGA 6840
CTAGTGCACA AACTCCCACG GTGAGAAAAA CAATTGTTAC CCTAAACACT CCGCCTCCCA 6900
CCACAGGCGA CAGACTTTTT GATCTTGTGC AGGGGGCCTT CCTAACCTTA AATGCTACCA 6960
ACCCAGGGGC CACTGAGTCT TGCTGGCTTT GTTTGGCCAT GGGCCCCCCT TATTATGAAG 7020
CAATAGCCTC ATCAGGAGAG GTCGCCTACT CCACCGACCT TGACCGGTGC CGCTGGGGGA 7080
CCCAAGGAAA GCTCACCCTC ACTGAGGTCT CAGGACACGG GTTGTGCATA GGAAAGGTGC 7140
CCTTTACCCA TCAGCATCTC TGCAATCAGA CCCTATCCAT CAATTCCTCC GGAGACCATC 7200
AGTATCTGCT CCCCTCCAAC CATAGCTGGT GGGCTTGCAG CACTGGCCTC ACCCCTTGCC 7260
TCTCCACCTC AGTTTTTAAT CAGACTAGAG ATTTCTGTAT CCAGGTCCAG CTGATTCCTC 7320
GCATCTATTA CTATCCTGAA GAAGTTTTGT TACAGGCCTA TGACAATTCT CACCCCAGGA 7380
CTAAAAGAGA GGCTGTCTCA CTTACCCTAG CTGTTTTACT GGGGTTGGGA ATCACGGCGG 7440
GAATAGGTAC TGGTTCAACT GCCTTAATTA AAGGACCTAT AGACCTCCAG CAAGGCCTGA 7500
CAAGCCTCCA GATCGCCATA GATGCTGACC TCCGGGCCCT CCAAGACTCA GTCAGCAAGT 7560
TAGAGGACTC ACTGACTTCC CTGTCCGAGG TAGTGCTCCA AAATAGGAGA GGCCTTGACT 7620
TGCTGTTTCT AAAAGAAGGT GGCCTCTGTG CGGCCCTAAA GGAAGAGTGC TGTTTTTACA 7680
TAGACCACTC AGGTGCAGTA CGGGACTCCA TGAAAAAACT CAAAGAAAAA CTGGATAAAA 7740
GACAGTTAGA GCGCCAGAAA AGCCAAAACT GGTATGAAGG ATGGTTCAAT AACTCCCCTT 7800
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GGTTCACTAC 


CCTGCTATCA 


ACCATCGCTG 


TCCTCGGGCC 


ATGCATCATC 


AATAAGTTAG 


GTTAAAATTC 


TGGTCCTTAG 


ACAAAATATC 


TTTGCTCTAA 


GATTAGAGCT 


ATTCACAAGA 


TTAGCCAACT 


GCAGTAACGC 


CATTTTGCTA 


TAAGAACAGG 


GCCAAACAGG 


ATATCTGTGG 


GACAGAGGGT 


TCCCAGAAAT 


AGATGAGTCA 


CCAGAAATAG 


ATGAGTCAAC 


AGCAGTTTCC 


CATGACCGGG 


AATTCACCCC 


TGGCCTTATT 


CTGTACCCGC 


GCTTTTTGCT 


ATAAAATAAG 


TAGAGAGACT 


GAGCCGCCCG 


GGTACCCGTG 


CGGAGCCGTG 


GTCTCGTTGT 


TCCTTGGGAG 


GGGGGTCTCA 


CATTT 









GGCCCCTATT 


ACTCCTCCTT 


CTGTTGCTCA 


7860 


TTCAATTCAT 


CAATGATAGG 


ATAAGTGCAT 


7920 


AGGCCCTAGA 


GAACGAAGGT 


AACCTTTAAT 


7980 


GAAATGGGGG 


AATGAAAGAA 


GTGTTTTTTT 


8040 


GGCACACCTA 


AAGGATAGGA 


AAAATACAGC 


8100 


TCATGCACCT 


GGGCCCCGGC 


CCAGGCCAAG 


8160 


ACAGCAGTTT 


CCAGCAAGGA 


CAGAGGGTTC 


8220 


AGGGTGCCCC 


TCAACCGTTT 


CAAGGACTCC 


8280 


TGAACTAACC 


AATTACCTTG 


CCTCTCGCTT 


8340 


CTCAGAAACT 


CCACCCGGAG 


CGCCAGTCCT 


8400 


TGTCCAATAA 


AACCTCTTGC 


TGATTGCATC 


8460 


GGTTTCTCCT 


AACTATTGAC 


CGCCCACTTC 


8520 
8535 
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(2) INFORMATION FOR SEQ ID NO: 2: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 564 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(ix) FEATURE: 

(A) NAME/KEY: LTR 

(B) LOCATION: 1..564 

(D) OTHER INFORMATION: /standard_name= "3' LTR of GaLV 
SEATO" 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2: 
AATGAAAGAA GTGTTTTTTT TTAGCCAACT GCAGTAACGC CATTTTGCTA GGCACACCTA 60 
AAGGATAGGA AAAATACAGC TAAGAACAGG GCCAAACAGG ATATCTGTGG TCATGCACCT 120 
GGGCCCCGGC CCAGGCCAAG GACAGAGGGT TCCCAGAAAT AGATGAGTCA ACAGCAGTTT 180 
CCAGCAAGGA CAGAGGGTTC CCAGAAATAG ATGAGTCAAC AGCAGTTTCC AGCAAGGACA 240 
GAGGGTTCCC AGAAATAGAT GAGTCAACAG CAGTTTCCAG AGGGTGCCCC TCAACCGTTT 300 
CAAGGACTCC CATGACCGGG AATTCACCCC TGGCCTTATT TGAACTAACC AATTACCTTG 360 
CCTCTCGCTT CTGTACCCGC GCTTTTTGCT ATAAAAATAA GCTCAGAAAC TCCACCCGGG 420 
CGCCAGTCCT TAGAGAGACT GAGCCGCCCG GGTACCCGTG TGTCCAATAA AACCTCTTGC 480 
TGATTGCATC CGGAGCCGTG GTCTCGTTGT TCCTTGGGAG GGTTTCTCCT AACTATTGAC 540 
CGCCCACTTC GGGGGTCTCA CATT 564 






(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 9661 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(ix) FEATURE: 

(A) NAME/KEY: misc_feature 

(B) LOCATION: 1..9613 

(D) OTHER INFORMATION: /standard_name= "p558 retroviral 
vector" 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: 



ATTTAGGTGA 


CACTATAGAA 


CTCGAGGAAT 


TCTGAAAGAA 


GTGTTTTTCA 


AGTTAGCTGC 


60 


AGTAACGCCA 


TTTTGCTAGG 


CACACCTAAA 


GGATAGGAAA 


AATACAGCTA 


AGAACAGGGC 


120 


CAAACAGGAT 


ATCTGTGGTC 


ATGCACCTGG 


GCCCCGGCCC 


AGGCCAAGGA 


CAGAGGGTTC 


180 


CCAGAAATAG 


ATGAGTCAAC 


AGCAGTTTCC 


AGCAAGGACA 


GAGGGTTCCC 


AGAAATAGAT 


240 


GAGTCAACAG 


CAGTTTCCAG 


GGTGCCCCTC 


AACCGTTTCA 


AGGACTCCCA 


TCACCGGGAA 


300 


TTCACCCCTG 


GCCTTATTTG 


AACTAACCAA 


TTACCTTGCC 


TCTCGCTTCT 


GTACCCGCGC 


360 


TTTTTGCTAT 


AAAAATAAGC 


TCAGAAACTC 


CACCCGGAGC 


GCCAGTCCTT 


AGAGAGACTG 


420 


AGCCGCCCGG 


GTACCCGTGT 


GTCCAATAAA 


ACCTCTTGCT 


GATTGCATCC 


GGAGCCGTGG 


480 


TCTCGTTGTT 


CCTTGGGAGG 


GTTTCTCCTA 


ACTATTGACC 


GCCCACTTCG 


GGGGTCTCAC 


540 


ATTTGGGGGC 


TCGTCCGGGA 


TCGGAAACCC 


CACCCAGGGA 


CCACCGACCC 


ACCAACGGGA 


600 


GGTAAGCTGG 


CCAGCGACCG 


TTGTGTGTCT 


CGCTTCTGTG 


TCTAAGTCCG 


TAATTCTGAC 


660 


TGTCCTTGTG 


TGTCTCGCTT 


CTGTGTCTGA 


GACCGTAACT 


CTGACTGCCC 


TTGTAAGTGC 


720 


GCGCATTTTT 


TTGGTTTCAG 


TCTGTTCCGG 


GTGAATCACT 


CTGCGAGTGA 


CGTGTGAGTA 


780 


GCGAACAGAC 


GTGTTCGGGG


CTCACCGCCT 


GGTAATCCAG 


GGAGACGTCC 


CAGGATCAGG 


840


GGAGGACCAG 


GGACGCCTGG 


TGGACCCCTC 


GGTAACGGGT 


CGTTGTGACC 


CGATTTCATC 


900 


GCCCGTCTGG 


TAAGACGCGC 


TCTGAATCTG 


ATTCTCTCTC 


TCGGTCGCCT 


CGCCGCCGTC 


960 


TCTGGTTTCT 


TTTTGTTTCG 


TTTCTGGAAA 


GCCTCTGTGT 


CACAGTCTTT 


CTCTCCCAAA 


1020 


TCATCAATAT 


GGGACAAGAT 


AATTCTACCC 


CTATCTCCCT 


CACTCTAAAT 


CACTGGAGAG 


1080 


ATGTGAGAAC 


AAGGGCTCAC 


AATCTATCCG 


TGGAAATCAA 


AAAGGGAAAA 


TGGCAGACTT 


1140 


TCTGTTCCTC 


CGAGTGGCCC 


ACATTCGGCG 


TGGGGTGGCC 


ACCGGAGGGA 


ACTTTTAATC 


1200 


TCTCTGTCAT 


TTTTGCAGTT 


AAAAAGATTG 


TCTTTCAGGA 


GAACGGGGGA 


CATCCGGACC 


1260 


AAGTTCCATA 


TATCGTGGTA 


TGGCAGGACC 


TCGCCCAGAA 


TCCCCCACCA 


TGGGTGCCAG 


1320 


CCTCCGCCAA 


GGTCGCTGTT 


GTCTCTGATA 


CCCGAAGACC 


AGTTGCGGGG 


AGGCCATCAG 


1380 


CTCCTCCCCG 


ACCCCCCATC 


TACCCGGCAA 


CAGACGACTT 


ACTCCTCCTC 


TCTGAACCCA 


1440 


CGCCCCCGCC 


CTATCCGGCG 


GCACTGCCAC 


CCCCTCTGGC 


CCCTCAGGCG 


ATCGGACCGC 


1500 
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CGTCAGGCCA 


GATGCCCGAT 


AGTAGCGATC 


CTGAGGGGCC 


AGCCGCTGGG


ACCAGGAGTC 


1560 


GCCGTGCCCG 


CAGTCCAGCA 


GACAACTCGG 


GTCCTGACTC 


CACTGTGATT 


TTGCCCCTCC 


1620 


GAGCCATAGG 


ACCCCCGGCC 


GAGCCCAATG 


GCCTGGTCCC 


TCTACAATAT 


TGGCCTTTTT 


1680 


CCTCAGCAGA


TCCCGTCGTT 


TTACAACGTC 


GTGACTGGGA 


AAACCCTGGC 


GTTACCCAAC 


1740 


TTAATCGCCT


TGCAGCACAT 


CCCCCTTTCG 


CCAGCTGGCG 


TAATAGCGAA 


GAGGCCCGCA 


1800 


CCGATCGCCC 


TTCCCAACAG 


TTGCGCAGCC 


TGAATGGCGA 


ATGGCGCTTT 


GCCTGGTTTC 


1860 


CGGCACCAGA 


AGCGGTGCCG 


GAAAGCTGGC 


TGGAGTGCGA 


TCTTCCTGAG 


GCCGATACTG 


1920 


TCGTCGTCCC 


CTCAAACTGG 


CAGATGCACG 


GTTACGATGC 


GCCCATCTAC 


ACCAACGTAA 


1980 


CCTATCCCAT 


TACGGTCAAT 


CCGCCGTTTG 


TTCCCACGGA 


GAATCCGACG 


GGTTGTTACT 


2040 


CGCTCACATT 


TAATGTTGAT 


GAAAGCTGGC 


TACAGGAAGG 


CCAGACGCGA 


ATTATTTTTG


2100 


ATGGCGTTAA 


CTCGGCGTTT 


CATCTGTGGT 


GCAACGGGCG 


CTGGGTCGGT 


TACGGCCAGG 


2160 


ACAGTCGTTT 


GCCGTCTGAA 


TTTGACCTGA 


GCGCATTTTT 


ACGCGCCGGA 


GAAAACCGCC 


2220 


TCGCGGTGAT 


GGTGCTGCGT 


TGGAGTGACG 


GCAGTTATCT 


GGAAGATCAG 


GATATGTGGC 


2280 


GGATGAGCGG 


CATTTTCCGT 


GACGTCTCGT 


TGCTGCATAA 


ACCGACTACA 


CAAATCAGCG 


2340 


ATTTCCATGT 


TGCCACTCGC 


TTTAATGATG 


ATTTCAGCCG 


CGCTGTACTG 


GAGGCTGAAG 


2400


TTCAGATGTG 


CGGCGAGTTG 


CGTGACTACC 


TACGGGTAAC 


AGTTTCTTTA 


TGGCAGGGTG


2460


AAACGCAGGT 


CGCCAGCGGC 


ACCGCGCCTT 


TCGGCGGTGA 


AATTATCGAT 


GAGCGTGGTG 


2520 


GTTATGCCGA 


TCGCGTCACA 


CTACGTCTGA 


ACGTCGAAAA 


CCCGAAACTG 


TGGAGCGCCG 


2580 


AAATCCCGAA 


TCTCTATCGT 


GCGGTGGTTG 


AACTGCACAC 


CGCCGACGGC 


ACGCTGATTG 


2640 


AAGCAGAAGC 


CTGCGATGTC 


GGTTTCCGCG 


AGGTGCGGAT 


TGAAAATGGT 


CTGCTGCTGC 


2700 


TGAACGGCAA 


GCCGTTGCTG 


ATTCGAGGCG 


TTAACCGTCA 


CGAGCATCAT 


CCTCTGCATG 


2760


GTCAGGTCAT 


GGATGAGCAG 


ACGATGGTGC 


AGGATATCCT 


GCTGATGAAG 


CAGAACAACT 


2820 


TTAACGCCGT 


GCGCTGTTCG 


CATTATCCGA 


ACCATCCGCT 


GTGGTACACG 


CTGTGCGACC 


2880 


GCTACGGCCT 


GTATGTGGTG 


GATGAAGCCA 


ATATTGAAAC 


CCACGGCATG 


GTGCCAATGA 


2940 


ATCGTCTGAC 


CGATGATCCG 


CGCTGGCTAC 


CGGCGATGAG 


CGAACGCGTA 


ACGCGAATGG 


3000 


TGCAGCGCGA 


TCGTAATCAC 


CCGAGTGTGA 


TCATCTGGTC 


GCTGGGGAAT 


GAATCAGGCC 


3060 


ACGGCGCTAA 


TCACGACGCG 


CTGTATCGCT 


GGATCAAATC 


TGTCGATCCT 


TCCCGCCCGG 


3120 


TGCAGTATGA 


AGGCGGCGGA 


GCCGACACCA 


CGGCCACCGA 


TATTATTTGC 


CCGATGTACG 


3180 


CGCGCGTGGA 


TGAAGACCAG 


CCCTTCCCGG 


CTGTGCCGAA 


ATGGTCCATC 


AAAAAATGGC 


3240 


TTTCGCTACC 


TGGAGAGACG 


CGCCCGCTGA 


TCCTTTGCGA 


ATACGCCCAC 


GCGATGGGTA 


3300 


ACAGTCTTGG 


CGGTTTCGCT 


AAATACTGGC 


AGGCGTTTCG 


TCAGTATCCC 


CGTTTACAGG 


3360 


GCGGCTTCGT 


CTGGGACTGG 


GTGGATCAGT 


CGCTGATTAA 


ATATGATGAA 


AACGGCAACC 


3420 


CGTGGTCGGC 


TTACGGCGGT 


GATTTTGGCG 


ATACGCCGAA 


CGATCGCCAG 


TTCTGTATGA 


3480 


ACGGTCTGGT 


CTTTGCCGAC 


CGCACGCCGC 


ATCCAGCGCT 


GACGGAAGCA 


AAACACCAGC 


3540 


AGCAGTTTTT 


CCAGTTCCGT 


TTATCCGGGC 


AAACCATCGA 


AGTGACCAGC 


GAATACCTGT 


3600 





TCCGTCATAG CGATAACGAG CTCCTGCACT 
CAAGCGGTGA AGTGCCTCTG GATGTCGCTC 
AACTACCGCA GCCGGAGAGC GCCGGGCAAC 
ACGCGACCGC ATGGTCAGAA GCCGGGCACA 
AAAACCTCAG TGTGACGCTC CCCGCCGCGT 
AAATGGATTT TTGCATCGAG CTGGGTAATA 
TTCTTTCACA GATGTGGATT GGCGATAAAA 
TCACCCGTGC ACCGCTGGAT AACGACATTG 
ACGCCTGGGT CGAACGCTGG AAGGCGGCGG 
AGTGCACGGC AGATACACTT GCTGATGCGG 
ATCAGGGGAA AACCTTATTT ATCAGCCGGA
TGGCGATTAC CGTTGATGTT GAAGTGGCGA 
TGAACTGCCA GCTGGCGCAG GTAGCAGAGC 
AAAACTATCC CGACCGCCTT ACTGCCGCCT 
ACATGTATAC CCCGTACGTC TTCCCGAGCG 
TGAATTATGG CCCACACCAG TGGCGCGGCG 
AACAGCAACT GATGGAAACC AGCCATCGCC 
TGAATATCGA CGGTTTCCAT ATGGGGATTG 
CGGCGGAATT GCAGCTGAGC GCCGGTCGCT 
AATAATAACC GGGCAGGCCA TGTCTGCCCG 
ATTTCTAGAG AATTCCCCCC TCTCCCTCCC
TGGAATAAGG CCGGTGTGCG TTTGTCTATA 
GCAATGTGAG GGCCCGGAAA CCTGGCCCTG 
CCCCTCTGCG CAAAGGAATG CAAGGTCTGT 
AAGCTTCTTG AAGACAAACA ACGTCTGTAG 
CTGGCGACAG GTGCCTCTGC GGCCAAAAGC 
CACAACCCCA GTGCCACGTT GTGAGTTGGA 
CAAGCGTATT CAACAAGGGG CTGAAGGATG 
ATCTGGGGCC TCGGTGCACA TGCTTTACAT 
CCCCCCGAAC CACGGGGACG TGGTTTTCCT 
TCCTAGGCTT TTGCAAAAAG CTCCCGGGAG 
AGACAGGATG AGGATCGTTT CGCATGATTG 
CCGCTTGGGT GGAGAGGCTA TTCGGCTATG 
ATGCCGCCGT GTTCCGGCTG TCAGCGCAGG 
TGTCCGGTGC CCTGAATGAA CTGCAGGACG 













GGATGGTGGC 


GCTGGATGGT 


AAGCCGCTGG 


3660 


CACAAGGTAA 


ACAGTTGATT 


GAACTGCCTG 


3720 


TCTGGCTCAC 


AGTACGCGTA 


GTGCAACCGA 


3780 


TCAGCGCCTG 


GCAGCAGTGG 


CGTCTGGCGG


3840 


CCCACGCCAT 


CCCGCATCTG 


ACCACCAGCG 


3900 


AGCGTTGGCA 


ATTTAACCGC 


CAGTCAGGCT 


3960 


AACAACTGCT 


GACGCCGCTG 


CGCGATCAGT 


4020 


GCGTAAGTGA 


AGCGACCCGC 


ATTGACCCTA 


4080 


GCCATTACCA 


GGCCGAAGCA 


GCGTTGTTGC 


4140 


TGCTGATTAC 


GACCGCTCAC 


GCGTGGCAGC 


4200 


AAACCTACCG 


GATTGATGGT 


AGTGGTCAAA 


4260 


GCGATACACC 


GCATCCGGCG


CGGATTGGCC 


4320 


GGGTAAACTG 


GCTCGGATTA 


GGGCCGCAAG 


4380 


GTTTTGACCG 


CTGGGATCTG 


CCATTGTCAG 


4440 


AAAACGGTCT 


GCGCTGCGGG 


ACGCGCGAAT 


4500 


ACTTCCAGTT 


CAACATCAGC 


CGCTACAGTC 


4560 


ATCTGCTGCA 


CGCGGAAGAA 


GGCACATGGC 


4620 


GTGGCGACGA 


CTCCTGGAGC 


CCGTCAGTAT 


4680 


ACCATTACCA 


GTTGGTCTGG 


TGTCAAAAAT 


4740 


TATTTCGCGT 


AAGGAAATCC 


ATTATGTACT 


4800 


CCCCCCCTAA 


CGTTACTGGC 


CGAAGCCGCT 


4860 


TGTTATTTTC


CACCATATTG 


CCGTCTTTTG 


4920 


TCTTCTTGAC 


GAGCATTCCT 


AGGGGTCTTT 


4980 


TGAATGTCGT 


GAAGGAAGCA 


GTTCCTCTGG 


5040 


CGACCCTTTG 


CAGGCAGCGG 


AACCCCCCAC 


5100 


CACGTGTATA 


AGATACACCT 


GCAAAGGCGG 


5160 


TAGTTGTGGA 


AAGAGTCAAA 


TGGCTCTCCT 


5220 


CCCAGAAGGT 


ACCCCATTGT 


ATGGGATCTG 


5280 


GTGTTTAGTC 


GAGGTTAAAA 


AACGTCTAGG 


5340 


TTGAAAAACA 


CGATGATAAT 


ATGGCCAAGC 


5400 


CTTGGATATC 


CATTTTCGGA 


TCTGATCAAG 


5460 


AACAAGATGG 


ATTGCACGCA 


GGTTCTCCGG 


5520 


ACTGGGCACA 


ACAGACAATC 


GGCTGCTCTG 


5580 


GGCGCCCGGT


TCTTTTTGTC


AAGACCGACC


5640


AGGCAGCGCG 


GCTATCGTGG 


CTGGCCACGA 


5700 
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CGGGCGTTCC 

TATTGGGCGA 

TATCCATCAT 

TCGACCACCA 

TCGATCAGGA 

GGCTCAAGGC 

TGCCGAATAT 

GTGTGGCGGA 

GCGGCGAATG 

GCATCGCCTT 

GACTTGCTGT 

TACATAGACC 

AAAAGACAGT 

CCTTGGTTCA 

CTCATCCTCG 

TGTTAAAATT 

TTTTGCTCTA 

TTTAGCCAAC 

CTAAGAACAG 

GGACAGAGGG 

CCCAGAAATA 

TGAGTCAACA 

ATTCACCCCT 

CTTTTTGCTA 

AGCCGCCCGG 

TCTCGCTGTT 

CATTGGAATT 

TCGATAAGCC 

CGTATTGGGC 

CGGCGAGCGG 

AACGCAGGAA 

GCGTTGCTGG 

TCAAGTCAGA 

AGCTCCCTCG 

CTCCCTTCGG 



TTGCGCAGCT 
AGTGCCGGGG 
GGCTGATGCA 
AGCGAAACAT 
TGATCTGGAC 
GCGCATGCCC 
CATGGTGGAA 
CCGCTATCAG 
GGCTGACCGC 
CTATCGCCTT 
TTCTAAAAGA 
ACTCAGGTGC 
TAGAGCGCCA 
CTACCCTGCT 
GGCCATGCAT 
CTGGTCCTTA 
AGATTAGAGC 
TGCAGTAACG 
GGCCAAACAG 
TTCCCAGAAA 
GATGAGTCAA 
GCAGTTTCCA 
GGCCTTATTT 
TAAAATAAGC 
GTACCCGTGT 
CCTTGGGAAG 
CATCGATGAT 
AGGTTAACCT 
GCTCTTCCGC 
TATCAGCTCA 
AGAACATGTG 
CGTTTTTCCA 
GGTGGCGAAA 
TGCGCTCTCC 
GAAGCGTGGC 



GTGCTCGACG 
CAGGATCTCC 
ATGCGGCGGC 
CGCATCGAGC 
GAAGAGCATC 
GACGGCGAGG 
AATGGCCGCT 
GACATAGCGT 
TTCCTCGTGC 
CTTGACGAGT 
AGGTGGCCTC 
AGTACGGGAC 
GAAAAGCCAA
ATCAACCATC 
CAATAAGTTA 
GACAAAATAT 
TATTCACAAG 
CCATTTTGCT 
GATATCTGTG 
TAGATGAGTC
CAGCAGTTTC 
GGGTGCCCCT 
GAACTAACCA 
TCAGAAACTC 
GATCAATAAA
GTCTCCCCTA 
ATCAGATCTG 
GCATTAATGA 
TTCCTCGCTC 
CTCAAAGGCG 
AGCAAAAGGC 
TAGGCTCCGC 
CCCGACAGGA 
TGTTCCGACC 
GCTTTCTCAA 
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TTGTCACTGA 
TGTCATCTCA 
TGCATACGCT 
GAGCACGTAC 
AGGGGCTCGC 
ATCTCGTCGT 
TTTCTGGATT 
TGGCTACCCG 
TTTACGGTAT 
TCTTCTGAGC 
TGTGCGGCCC 
TCCATGAAAA 
AACTGGTATG 
GCTGGGCCCC 
GTTCAATTCA 
CAGGCCCTAG 
AGAAATGGGG 
AGGCACACCT 
GTCATGCACC 
AACAGCAGTT 
CAGCAAGGAC 
CAACCGTTTC 
ATTACCTTGC 
CACCCGGAGC 
ACCTCTTGCT 
ATTGATTGAC 
CCGGTCTCCC 
ATCGGCCAAC 
ACTGACTCGC 
GTAATACGGT 
CAGCAAAAGG 
CCCCCTGACG 
CTATAAAGAT 
CTGCCGCTTA 
TGCTCACGCT 



AGCGGGAAGG 
CCTTGCTCCT 
TGATCCGGCT 
TCGGATGGAA 
GCCAGCCGAA 
GACCCATGGC 
CATCGACTGT 
TGATATTGCT 
CGCCGCTCCC 
GGGACTCTGG 
TAAAGGAAGA 
AACTCAAAGA 
AAGGATGGTT 
TATTACTCCT 
TCAATGATAG 
AGAACGAAGG 
GAATGAAAGA 
AAAGGATAGG 
TGGGCCCCGG 
TCCAGCAAGG 
AGAGGGTTCC 
AAGGACTCCC 
CTCTCGCTTC 
GCCAGTCCTT 
ACTTGCATCC 
CGCCCGGACT 
TATAGTGAGT 
GCGCGGGGAG 
TGCGCTCGGT 
TATCCACAGA 
CCAGGAACCG 
AGCATCACAA 
ACCAGGCGTT 
CCGGATACCT 
GTAGGTATCT 
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GACTGGCTGC 5760 

GCCGAGAAAG 5820

ACCTGCCCAT 5880

GCCGGTCTTG 5940

CTGTTCGCCA 6000

GATGCCTGCT 6060 

GGCCGGCTGG 6120 

GAAGAGCTTG 6180

GATTCGCAGC 6240 

GGTTCGCCTT 6300 

GTGCTGTTTT 6360 

AAAACTGGAT 6420

CAATAACTCC 6480 

CCTTCTGTTG 6540 

GATAAGTGCA 6600 

TAACCTTTAA 6660 

AGTGTTTTTT 6720 

AAAAATACAG 6780 

CCCAGGCCAA 6840

ACAGAGGGTT 6900 

CAGAAATAGA 6960 

ATGACCGGGA 7020 

TGTACCCGCG 7080 

AGAGAGACTG 7140

GAAGTCGTGG 7200 

GGGGGTCTCT 7260 

CGTATTAATT 7320 

AGGCGGTTTG 7380 

CGTTCGGCTG 7440 

ATCAGGGGAT 7500 

TAAAAAGGCC 7560 

AAATCGACGC 7620 

TCCCCCTGGA 7680 

GTCCGCCTTT 7740 
CAGTTCGGTG 7800 
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TAGGTCGTTC 


GCTCCAAGCT 


GGGCTGTGTG


GCCTTATCCG 


GTAACTATCG 


TCTTGAGTCC 


GCAGCAGCCA 


CTGGTAACAG 


GATTAGCAGA 


TTGAAGTGGT 


GGCCTAACTA 


CGGCTACACT 


CTGAAGCCAG 


TTACCTTCGG 


AAAAAGAGTT 


GCTGGTAGCG 


GTGGTTTTTT 


TGTTTGCAAG 


CAAGAAGATC 


CTTTGATCTT 


TTCTACGGGG 


TAAGGGATTT 


TGGTCATGAG 


ATTATCAAAA 


AAATGAAGTT 


TTAAATCAAT 


CTAAAGTATA 


TGCTTAATCA 


GTGAGGCACC 


TATCTCAGCG 


TGACTCCCCG 


TCGTGTAGAT 


AACTACGATA 


GCAATGATAC 


CGCGAGACCC 


ACGCTCACCG 


GCCGGAAGGG 


CCGAGCGCAG 


AAGTGGTCCT 


AATTGTTGCC 


GGGAAGCTAG 


AGTAAGTAGT 


GCCATTGCTA 


CAGGCATCGT 


GGTGTCACGC 


GGTTCCCAAC 


GATCAAGGCG 


AGTTACATGA 


TCCTTCGGTC 


CTCCGATCGT 


TGTCAGAAGT 


ATGGCAGCAC 


TGCATAATTC 


TCTTACTGTC 


GGTGAGTACT 


CAACCAAGTC 


ATTCTGAGAA 


CCGGCGTCAA 


TACGGGATAA 


TACCGCGCCA 


GGAAAACGTT 


CTTCGGGGCG 


AAAACTCTCA 


ATGTAACCCA 


CTCGTGCACC 


CAACTGATCT 


GGGTGAGCAA 


AAACAGGAAG 


GCAAAATGCC 


TGTTGAATAC 


TCATACTCTT 


CCTTTTTCAA 


CTCATGAGCG 


GATACATATT 


TGAATGTATT 


ACATTTCCCC 


GAAAAGTGCC 


ACCTGACGTC 


TATAAAAATA 


GGCGTATCAC 


GAGGCCCTTT 


AACCTCTGAC 


ACATGCAGCT 


CCCGGAGACG 


AGCAGACAAG 


CCCGTCAGGG 


CGCGTCAGCG 













CACGAACCCC 


CCGTTCAGCC 


CGACCGCTGC 


7860 


AACCCGGTAA 


GACACGACTT 


ATCGCCACTG 


7920 


GCGAGGTATG 


TAGGCGGTGC 


TACAGAGTTC 


7980 


AGAAGGACAG 


TATTTGGTAT 


CTGCGCTCTG 


8040 


GGTAGCTCTT 


GATCCGGCAA 


ACAAACCACC 


8100 


CAGCAGATTA 


CGCGCAGAAA 


AAAAGGATCT 


8160 


TCTGACGCTC 


AGTGGAACGA 


AAACTCACGT 


8220 


AGGATCTTCA 


CCTAGATCCT 


TTTAAATTAA 


8280


TATGAGTAAA


CTTGGTCTGA 


CAGTTACCAA 


8340 


ATCTGTCTAT 


TTCGTTCATC 


CATAGTTGCC 


8400 


CGGGAGGGCT


TACCATCTGG 


CCCCAGTGCT 


8460 


GCTCCAGATT 


TATCAGCAAT 


AAACCAGCCA 


8520 


GCAACTTTAT 


CCGCCTCCAT 


CCAGTCTATT 


8580


TCGCCAGTTA 


ATAGTTTGCG 


CAACGTTGTT 


8640 


TCGTCGTTTG 


GTATGGCTTC 


ATTCAGCTCC 


8700 


TCCCCCATGT 


TGTGCAAAAA


AGCGGTTAGC 


8760 


AAGTTGGCCG 


CAGTGTTATC 


ACTCATGGTT 


8820 


ATGCCATCCG 


TAAGATGCTT 


TTCTGTGACT 


8880 


TAGTGTATGC 


GGCGACCGAG 


TTGCTCTTGC 


8940 


CATAGCAGAA 


CTTTAAAAGT 


GCTCATCATT 


9000 


AGGATCTTAC 


CGCTGTTGAG 


ATCCAGTTCG 


9060 


TCAGCATCTT 


TTACTTTCAC 


CAGCGTTTCT 


9120 


GCAAAAAAGG 


GAATAAGGGC 


GACACGGAAA 


9180 


TATTATTGAA 


GCATTTATCA 


GGGTTATTGT 


9240 


TAGAAAAATA 


AACAAATAGG 


GGTTCCGCGC 


9300 


TAAGAAACCA 


TTATTATCAT 


GACATTAACC 


9360 


CGTCTCGCGC 


GTTTCGGTGA 


TGACGGTGAA 


9420 


GTCACAGCTT


GTCTGTAAGC


GGATGCCGGG

9480


GGTGTTGGCG 


GGTGTCGGGG 


CTGGCTTAAC 


9540 
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TATGCGGCAT CAGAGCAGAT TGTACTGAGA GTGCACCATA TGGACATATT GTCGTTAGAA 9600 

CGCGGCTACA ATTAATACAT AACCTTATGT ATCATACACA TACGATTTAG GTGACACTAT 9660

A 9661 
(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10306 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(ix) FEATURE: 

(A) NAME/KEY: misc_feature 

(B) LOCATION: 1..10258 

(D) OTHER INFORMATION: /standard_name= "pS21 retroviral 
vector" 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4: 



AAGCTTCGGC 


CAAGTGCGGC 


CCTTCCGTTT 


CTTTGCTTTT 


GAAAGACCCC 


ACCCGTAGGT 


60 


GGCAAGCTAG 


CTTAAGTAAC 


GCCACTTTGC 


AAGGCATGGA 


AAAATACATA 


ACTGAGAATA 


120 


GGAAAGTTCA 


GATCAAGGTC 


AGGAACAAAG 


AAACAGCTGA 


ATACCAAACA 


GGATATCTGT 


180 


GGTAAGCGGT 


TCCTGCCCCC 


GGCTCAGGGC 


CAAGAACAGA 


TGAGACAGCT 


GAGTGATGGG 


240 


CCAAACAGGA 


TATCTGTGGT 


AAGCAGTTCC 


TGCCCCGGCT 


CGGGGCCAAG 


AACAGATGGT 


300 


CCCCAGATGC 


GGTCCAGCCC 


TCAGCAGTTT 


CTAGTGAATC 


ATCAGATGTT 


TCCAGGGTGC 


360 


CCCAAGGACC 


TGAAAATGAC 


CCTGTACCTT 


ATTTGAACTA 


ACCAATCAGT 


TCGCTTCTCG


420 


CTTCTGTTCG 


CGCGCTTCCG 


CTCTCCGAGC 


TCAATAAAAG 


AGCCCACAAC 


CCCTCACTCG 


480


GCGCGCCAGT 


CTTCCGATAG 


ACTGCGTCGC 


CCGGGTACCC 


GTATTCCCAA 


TAAAGCCTCT 


540 


TGCTGTTTGC 


ATCCGAATCG 


TGGTCTCGCT 


GTTCCTTGGG 


AGGGTCTCCT 


CTGAGTGATT 


600 


GACTACCCAC 


GACGGGGGTC 


TTTCATTTGG 


GGGCTCGTCC 


GGGATTTGGA 


GACCCCTGCC 


660 


CAGGGACCAC 


CGACCCACCA 


CCGGGAGGTA 


AGCTGGCCAG 


CAACCTATCT 


GTGTCTGTCC 


720 


GATTGTCTAG 


TGTCTATGTT 


TGATGTTATG 


CGCCTGCGTC 


TGTACTAGTT 


AGCTAACTAG 


780 


CTCTGTATCT 


GGCGGACCCG 


TGGTGGAACT 


GACGAGTTCT 


GAACACCCGG 


CCGCAACCCA 


840 


GGGAGACGTC 


CCAGGGACTT 


TGGGGGCCGT 


TTTTGTGGCC 


CGACCTGAGG 


AAGGGAGTCG 


900 


ATGTGGAATC 


CGACCCCGTC 


AGGATATGTG 


GTTCTGGTAG 


GAGACGAGAA 


CCTAAAACAG 


960 


TTCCCGCCTC 


CGTCTGAATT 


TTTGCTTTCG 


GTTTGGAACC 


GAAGCCGCGC 


GTCTTGTCTG 


1020 


CTGCAGCATC 


GTTCTGTGTT 


GTCTCTGTCT 


GACTGTGTTT 


CTGTATTTGT 


CTGAAAATTA 


1080 


GGGCCAGACT 


GTTACCACTC 


CCTTAAGTTT 


GACCTTAGGT 


CACTGGAAAG 


ATGTCGAGCG 


1140 


GATCGCTCAC 


AACCAGTCGG 


TAGATGTCAA


GAAGAGACGT 


TGGGTTACCT 


TCTGCTCTGC 


1200 


AGAATGGCCA 


ACCTTTACGT 


CGGATGGCCG 


CGAGACGGCA 


CCTTTAACCG 


AGACCTCATC 


1260 


ACCCAGGTTA 


AGATCAAGGT 


CTTTTCACCT 


GGCCCGCATG 


GACACCCAGA 


CCAGGTCCCC 


1320 



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TACATCGTGA 


CCTGGGAAGC 


CTTGGCTTTT 


GACCCCCCTC 


CCTGGGTCAA 


GCCCTTTGTA 


1380 


CACCCTAAGC 


CTCCGCCTCC 


TCTTCCTCCA 


TCCGCCCCGT 


CTCTCCCCCT 


TGAACCTCCT 


1440 


CGTTCGACCC 


CGCCTCGATC 


CTCCCTTTAT 


CCAGCCCTCA 


CTCCTTCTCT 


AGGCGGGAAT 


1500 


TCGTTAACTC 


GACCCGCGGG 


TCGACTCGCG 


AAGATCTTTC 


CGCAGCAGCC 


GCCACCATGG 


1560 


TTACGGATTC 


GGATCCCGTC 


GTTTTACAAC 


GTCGTGACTG 


GGAAAACCCT 


GGCGTTACCC 


1620 


AACTTAATCG 


CCTTGCAGCA 


CATCCCCCTT 


TCGCCAGCTG 


GCGTAATAGC 


GAAGAGGCCC 


1680 


GCACCGATCG 


CCCTTCCCAA 


CAGTTGCGCA 


GCCTGAATGG 


CGAATGGCGC 


TTTGCCTGGT 


1740 


TTCCGGCACC 


AGAAGCGGTG 


CCGGAAAGCT


GGCTGGAGTG 


CGATCTTCCT 


GAGGCCGATA 


1800 


CTGTCGTCGT 


CCCCTCAAAC 


TGGCAGATGC 


ACGGTTACGA 


TGCGCCCATC 


TACACCAACG 


1860 


TAACCTATCC 


CATTACGGTC 


AATCCGCCGT 


TTGTTCCCAC 


GGAGAATCCG 


ACGGGTTGTT 


1920 


ACTCGCTCAC 


ATTTAATGTT 


GATGAAAGCT 


GGCTACAGGA 


AGGCCAGACG 


CGAATTATTT 


1980 


TTGATGGCGT 


TAACTCGGCG 


TTTCATCTGT 


GGTGCAACGG 


GCGCTGGGTC 


GGTTACGGCC 


2040 


AGGACAGTCG 


TTTGCCGTCT 


GAATTTGACC 


TGAGCGCATT 


TTTACGCGCC 


GGAGAAAACC 


2100 


GCCTCGCGGT 


GATGGTGCTG 


CGTTGGAGTG 


ACGGCAGTTA 


TCTGGAAGAT 


CAGGATATGT 


2160 


GGCGGATGAG 


CGGCATTTTC 


CGTGACGTCT 


CGTTGCTGCA 


TAAACCGACT 


ACACAAATCA 


2220 


GCGATTTCCA 


TGTTGCCACT 


CGCTTTAATG 


ATGATTTCAG 


CCGCGCTGTA 


CTGGAGGCTG 


2280 


AAGTTCAGAT 


GTGCGGCGAG 


TTGCGTGACT 


ACCTACGGGT 


AACAGTTTCT 


TTATGGCAGG 


2340 


GTGAAACGCA 


GGTCGCCAGC 


GGCACCGCGC 


CTTTCGGCGG 


TGAAATTATC 


GATGAGCGTG 


2400 


GTGGTTATGC 


CGATCGCGTC 


ACACTACGTC 


TGAACGTCGA 


AAACCCGAAA 


CTGTGGAGCG 


2460 


CCGAAATCCC 


GAATCTCTAT 


CGTGCGGTGG 


TTGAACTGCA 


CACCGCCGAC 


GGCACGCTGA 


2520 


TTGAAGCAGA 


AGCCTGCGAT 


GTCGGTTTCC 


GCGAGGTGCG 


GATTGAAAAT 


GGTCTGCTGC 


2580 


TGCTGAACGG 


CAAGCCGTTG 


CTGATTCGAG 


GCGTTAACCG 


TCACGAGCAT 


CATCCTCTGC 


2640 


ATGGTCAGGT 


CATGGATGAG 


CAGACGATGG 


TGCAGGATAT 


CCTGCTGATG 


AAGCAGAACA 


2700 


ACTTTAACGC 


CGTGCGCTGT 


TCGCATTATC 


CGAACCATCC 


GCTGTGGTAC 


ACGCTGTGCG 


2760 


ACCGCTACGG 


CCTGTATGTG 


GTGGATGAAG 


CCAATATTGA 


AACCCACGGC 


ATGGTGCCAA 


2820 


TGAATCGTCT 


GACCGATGAT 


CCGCGCTGGC 


TACCGGCGAT 


GAGCGAACGC 


GTAACGCGAA 


2880 


TGGTGCAGCG 


CGATCGTAAT 


CACCCGAGTG 


TGATCATCTG 


GTCGCTGGGG 


AATGAATCAG 


2940 


GCCACGGCGC 


TAATCACGAC 


GCGCTGTATC 


GCTGGATCAA 


ATCTGTCGAT 


CCTTCCCGCC 


3000 


CGGTGCAGTA 


TGAAGGCGGC 


GGAGCCGACA 


CCACGGCCAC 


CGATATTATT 


TGCCCGATGT 


3060 


ACGCGCGCGT 


GGATGAAGAC 


CAGCCCTTCC 


CGGCTGTGCC 


GAAATGGTCC 


ATCAAAAAAT 


3120 


GGCTTTCGCT 


ACCTGGAGAG 


ACGCGCCCGC 


TGATCCTTTG 


CGAATACGCC 


CACGCGATGG 


3180 


GTAACAGTCT 


TGGCGGTTTC 


GCTAAATACT 


GGCAGGCGTT 


TCGTCAGTAT 


CCCCGTTTAC 


3240 


AGGGCGGCTT 


CGTCTGGGAC 


TGGGTGGATC 


AGTCGCTGAT


TAAATATGAT 


GAAAACGGCA 


3300 


ACCCGTGGTC


GGCTTACGGC


GGTGATTTTG


GCGATACGCC


GAACGATCGC


CAGTTCTGTA


3360


TGAACGGTCT 


GGTCTTTGCC 


GACCGCACGC 


CGCATCCAGC 


GCTGACGGAA 


GCAAAACACC 


3420 
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AGCAGCAGTT 


TTTCCAGTTC 


CGTTTATCCG 


GGCAAACCAT 


CGAAGTGACC 


AGCGAATACC 


3480 


TGTTCCGTCA 


TAGCGATAAC


GAGCTCCTGC 


ACTGGATGGT 


GGCGCTGGAT 


GGTAAGCCGC 


3540 


TGGCAAGCGG 


TGAAGTGCCT


CTGGATGTCG 


CTCCACAAGG 


TAAACAGTTG 


ATTGAACTGC 


3600 


CTGAACTACC 


GCAGCCGGAG 


AGCGCCGGGC 


AACTCTGGCT 


CACAGTACGC 


GTAGTGCAAC 


3660 


CGAACGCGAC 


CGCATGGTCA 


GAAGCCGGGC 


ACATCAGCGC 


CTGGCAGCAG 


TGGCGTCTGG 


3720 


CGGAAAACCT 


CAGTGTGACG 


CTCCCCGCCG 


CGTCCCACGC 


CATCCCGCAT 


CTGACCACCA 


3780 


GCGAAATGGA 


TTTTTGCATC 


GAGCTGGGTA 


ATAAGCGTTG 


GCAATTTAAC 


CGCCAGTCAG 


3840 


GCTTTCTTTC 


ACAGATGTGG 


ATTGGCGATA 


AAAAACAACT 


GCTGACGCCG 


CTGCGCGATC 


3900 


AGTTCACCCG 


TGCACCGCTG 


GATAACGACA 


TTGGCGTAAG 


TGAAGCGACC 


CGCATTGACC 


3960 


CTAACGCCTG 


GGTCGAACGC 


TGGAAGGCGG 


CGGGCCATTA 


CCAGGCCGAA 


GCAGCGTTGT 


4020 


TGCAGTGCAC 


GGCAGATACA 


CTTGCTGATG 


CGGTGCTGAT 


TACGACCGCT 


CACGCGTGGC 


4080 


AGCATCAGGG 


GAAAACCTTA 


TTTATCAGCC 


GGAAAACCTA 


CCGGATTGAT 


GGTAGTGGTC 


4140 


AAATGGCGAT 


TACCGTTGAT 


GTTGAAGTGG 


CGAGCGATAC 


ACCGCATCCG 


GCGCGGATTG 


4200 


GCCTGAACTG 


CCAGCTGGCG 


CAGGTAGCAG 


AGCGGGTAAA 


CTGGCTCGGA 


TTAGGGCCGC 


4260 


AAGAAAACTA 


TCCCGACCGC 


CTTACTGCCG 


CCTGTTTTGA 


CCGCTGGGAT 


CTGCCATTGT


4320 


CAGACATGTA 


TACCCCGTAC 


GTCTTCCCGA 


GCGAAAACGG 


TCTGCGCTGC 


GGGACGCGCG 


4380 


AATTGAATTA 


TGGCCCACAC 


CAGTGGCGCG 


GCGACTTCCA 


GTTCAACATC 


AGCCGCTACA 


4440 


GTCAACAGCA


ACTGATGGAA 


ACCAGCCATC 


GCCATCTGCT 


GCACGCGGAA 


GAAGGCACAT 


4500 


GGCTGAATAT 


CGACGGTTTC 


CATATGGGGA 


TTGGTGGCGA 


CGACTCCTGG 


AGCCCGTCAG 


4560 


TATCGGCGGA 


ATTGCAGCTG 


AGCGCCGGTC 


GCTACCATTA 


CCAGTTGGTC 


TGGTGTCAAA 


4620 


AATAATAATA 


ACCGGGCAGG 


CCATGTCTGC 


CCGTATTTCG 


CGTAAGGAAA 


TCCATTATGT


4680 


ACTATTTCTA


GAGAATTCCC 


CCCTCTCCCT 


CCCCCCCCCC 


TAACGTTACT 


GGCCGAAGCC 


4740 


GCTTGGAATA 


AGGCCGGTGT 


GCGTTTGTCT 


ATATGTTATT 


TTCCACCATA 


TTGCCGTCTT 


4800 


TTGGCAATGT 


GAGGGCCCGG 


AAACCTGGCC 


CTGTCTTCTT 


GACGAGCATT 


CCTAGGGGTC 


4860 


TTTCCCCTCT 


GCGCAAAGGA 


ATGCAAGGTC 


TGTTGAATGT 


CGTGAAGGAA 


GCAGTTCCTC 


4920 


TGGAAGCTTC


TTGAAGACAA 


ACAACGTCTG 


TAGCGACCCT 


TTGCAGGCAG 


CGGAACCCCC 


4980 


CACCTGGCGA 


CAGGTGCCTC 


TGCGGCCAAA 


AGCCACGTGT 


ATAAGATACA 


CCTGCAAAGG 


5040 


CGGCACAACC 


CCAGTGCCAC 


GTTGTGAGTT 


GGATAGTTGT 


GGAAAGAGTC 


AAATGGCTCT 


5100 


CCTCAAGCGT 


ATTCAACAAG 


GGGCTGAAGG 


ATGCCCAGAA 


GGTACCCCAT 


TGTATGGGAT 


5160 


CTGATCTGGG 


GCCTCGGTGC 


ACATGCTTTA 


CATGTGTTTA 


GTCGAGGTTA 


AAAAACGTCT 


5220 


AGGCCCCCCG 


AACCACGGGG 


ACGTGGTTTT 


CCTTTGAAAA 


ACACGATGAT 


AATATGGCCA 


5280 


AGCTCCTAGG 


CTTTTGCAAA 


AAGCTCCCGG


GAGCTTGGAT 


ATCCATTTTC 


GGATCTGATC 


5340 


AAGAGACAGG 


ATGAGGATCG 


TTTCGCATGA 


TTGAACAAGA 


TGGATTGCAC 


GCAGGTTCTC 


5400 


CGGCCGCTTG 


GGTGGAGAGG 


CTATTCGGCT 


ATGACTGGGC 


ACAACAGACA 


ATCGGCTGCT 


5460 


CTGATGCCGC 


CGTGTTCCGG 


CTGTCAGCGC 


AGGGGCGCCC 


GGTTCTTTTT 


GTCAAGACCG 


5520 
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ACCTGTCCGG 


TGCCCTGAAT 


GAACTGCAGG 


ACGAGGCAGC 


GCGGCTATCG 


TGGCTGGCCA 


5580 


CGACGGGCGT 


TCCTTGCGCA 


GCTGTGCTCG 


ACGTTGTCAC 


TGAAGCGGGA 


AGGGACTGGC 


5640 


TGCTATTGGG 


CGAAGTGCCG 


GGGCAGGATC 


TCCTGTCATC 


TCACCTTGCT 


CCTGCCGAGA 


5700 


AAGTATCCAT 


CATGGCTGAT 


GCAATGCGGC 


GGCTGCATAC 


GCTTGATCCG 


GCTACCTGCC 


5760 


CATTCGACCA 


CCAAGCGAAA 


CATCGCATCG 


AGCGAGCACG 


TACTCGGATG 


GAAGCCGGTC 


5820 


TTGTCGATCA 


GGATGATCTG 


GACGAAGAGC 


ATCAGGGGCT 


CGCGCCAGCC 


GAACTGTTCG 


5880 


CCAGGCTCAA 


GGCGCGCATG 


CCCGACGGCG 


AGGATCTCGT 


CGTGACCCAT 


GGCGATGCCT 


5940 


GCTTGCCGAA 


TATCATGGTG 


GAAAATGGCC 


GCTTTTCTGG 


ATTCATCGAC 


TGTGGCCGGC 


6000 


TGGGTGTGGC 


GGACCGCTAT 


CAGGACATAG 


CGTTGGCTAC 


CCGTGATATT 


GCTGAAGAGC 


6060 


TTGGCGGCGA 


ATGGGCTGAC


CGCTTCCTCG 


TGCTTTACGG 


TATCGCCGCT 


CCCGATTCGC 


6120 


AGCGCATCGC 


CTTCTATCGC 


CTTCTTGACG 


AGTTCTTCTG 


AGCGGGACTC 


TGGGGTTCGC 


6180 


CTTGACTTGC 


TGTTTCTAAA 


AGAAGGTGGC 


CTCTGTGCGG 


CCCTAAAGGA 


AGAGTGCTGT 


6240 


TTTTACATAG 


ACCACTCAGG 


TGCAGTACGG 


GACTCCATGA 


AAAAACTCAA 


AGAAAAACTG 


6300 


GATAAAAGAC 


AGTTAGAGCG 


CCAGAAAAGC 


CAAAACTGGT 


ATGAAGGATG 


GTTCAATAAC 


6360 


TCCCCTTGGT 


TCACTACCCT 


GCTATCAACC 


ATCGCTGGGC 


CCCTATTACT 


CCTCCTTCTG 


6420 


TTGCTCATCC


TCGGGCCATG 


CATCATCAAT 


AAGTTAGTTC 


AATTCATCAA 


TGATAGGATA 


6480 


AGTGCATGTT 


AAAATTCTGG 


TCCTTAGACA 


AAATATCAGG 


CCCTAGAGAA 


CGAAGGTAAC 


6540 


CTTTAATTTT 


GCTCTAAGAT 


TAGAGCTATT 


CACAAGAGAA 


ATGGGGGAAT 


GAAAGAAGTG 


6600 


TTTTTTTTTA 


GCCAACTGCA 


GTAACGCCAT 


TTTGCTAGGC 


ACACCTAAAG 


GATAGGAAAA 


6660 


ATACAGCTAA 


GAACAGGGCC 


AAACAGGATA 


TCTGTGGTCA 


TGCACCTGGG 


CCCCGGCCCA 


6720 


GGCCAAGGAC 


AGAGGGTTCC 


CAGAAATAGA 


TGAGTCAACA 


GCAGTTTCCA 


GCAAGGACAG 


6780 


AGGGTTCCCA 


GAAATAGATG 


AGTCAACAGC 


AGTTTCCAGC 


AAGGACAGAG 


GGTTCCCAGA 


6840 


AATAGATGAG 


TCAACAGCAG 


TTTCCAGGGT 


GCCCCTCAAC 


CGTTTCAAGG 


ACTCCCATGA 


6900 


CCGGGAATTC 


ACCCCTGGCC 


TTATTTGAAC 


TAACCAATTA 


CCTTGCCTCT 


CGCTTCTGTA 


6960 


CCCGCGCTTT 


TTGCTATAAA 


ATAAGCTCAG 


AAACTCCACC 


CGGAGCGCCA 


GTCCTTAGAG 


7020 


AGACTGAGCC 


GCCCGGGTAC


CCGTGTGTCC 


AATAAAACCT 


CTTGCTGATT 


GCATCCGGAG 


7080 


CCGTGGTCTC 


GTTGTTCCTT 


GGGAGGGTTT 


CTCCTAACTA 


TTGACCGCCC 


ACTTCGGGGG 


7140 


TCTCACATTT


GCGGCCGCCA 


ATTCGCCCTA 


TAGTGAGTCG 


TATTACAATT 


CACTGGCCGT 


7200 


CGTTTTACAA 


CGTCGTGACT 


GGGAAAACCC 


TGGCGTTACC 


CAACTTAATC 


GCCTTGCAGC 


7260 


ACATCCCCCT 


TTCGCCAGCT 


GGCGTAATAG 


CGAAGAGGCC 


CGCACCGATC 


GCCCTTCCCA 


7320 


ACAGTTGCGC 


AGCCTGAATG 


GCGAATGGAA 


ATTGTAAACG 


TTAATATTTT 


GTTAAAATTC 


7380 


GCGTTAAATA


TTTGTTAAAT 


CAGCTCATTT 


TTTAACCAAT 


AGGCCGAAAT 


CGGCAAAATC 


7440 


CCTTATAAAT 


CAAAAGAATA 


GACCGAGATA 


GGGTTGAGTG 


TTGTTCCAGT 


TTGGAACAAG 


7500 


AGTCCACTAT 






GTCAAAGGGC 


GAAAAACCGT 


CTATCAGGGC 


7560 


GATGGCCCAC 


TACGTGAACC 


ATCACCCAAA 


TCAAGTTTTT 


TGCGGTCGAG 


GTGCCGTAAA 


7620 
GCTCTAAATC


GGAACCCTAA


AGGGAGCCCC 


CGATTTAGAG 


CTTGACGGGG 


AAAGCCGGCG 


7680 


AACGTGGCGA 


GAAAGGAAGG 


GAAGAAAGCG 


AAAGGAGCGG 


GCGCTAGGGC 


GCTGGCAAGT 


7740 


GTAGCGGTCA 


CGCTGCGCGT 


AACCACCACA 


CCCGCCGCGC 


TTAATGCGCC 


GCTACAGGGC 


7800 


GCGTCGCCTG 


ATGCGGTATT 


TTCTCCTTAC 


GCATCTGTGC 


GGTATTTCAC 


ACCGCATATG 


7860 


GTGCACTCTC 


AGTACAATCT 


GCTCTGATGC 


CGCATAGTTA 


AGCCAGCCCC 


GACACCCGCC 


7920 


AACACCCGCT 


GACGCGCCCT 


GACGGGCTTG 


TCTGCTCCCG 


GCATCCGCTT 


ACAGACAAGC


7980 


TGTGACCGTC 


TCCGGGAGCT 


GCATGTGTCA


GAGGTTTTCA 


CCGTCATCAC 


CGAAACGCGC 


8040 


GAGACGAAAG 


GGCCTCGTGA 


TACGCCTATT


TTTATAGGTT 


AATGTCATGA 


TAATAATGGT 


8100 


TTCTTAGACG 


TCAGGTGGCA 


CTTTTCGGGG 


AAATGTGCGC 


GGAACCCCTA 


TTTGTTTATT 


8160 


TTTCTAAATA 


CATTCAAATA 


TGTATCCGCT 


CATGAGACAA 


TAACCCTGAT 


AAATGCTTCA 


8220 


ATAATATTGA 


AAAAGGAAGA 


GTATGAGTAT 


TCAACATTTC 


CGTGTCGCCC 


TTATTCCCTT 


8280 


TTTTGCGGCA 


TTTTGCCTTC 


CTGTTTTTGC 


TCACCCAGAA 


ACGCTGGTGA 


AAGTAAAAGA 


8340 


TGCTGAAGAT 


CAGTTGGGTG 


CACGAGTGGG 


TTACATCGAA 


CTGGATCTCA 


ACAGCGGTAA 


8400 


GATCCTTGAG 


AGTTTTCGCC 


CCGAAGAACG 


TTTTCCAATG 


ATGAGCACTT 


TTAAAGTTCT 


8460 


GCTATGTCAT 


ACACTATTAT 


CCCGTATTGA 


CGCCGGGCAA 


GAGCAACTCG 


GTCGCCGGGC 


8520 


GCGGTATTCT 


CAGAATGACT 


TGGTTGAGTA


CTCACCAGTC 


ACAGAAAAGC 


ATCTTACGGA 


8580 


TGGCATGACA 


GTAAGAGAAT 


TATGCAGTGC 


TGCCATAACC 


ATGAGTGATA 


ACACTGCGGC 


8640 


CAACTTACTT 


CTGACAACGA 


TCGGAGGACC 


GAAGGAGCTA 


ACCGCTTTTT 


TGCACAACAT 


8700 


GGGGGATCAT 


GTAACTCGCC 


TTGATCGTTG 


GGAACCGGAG 


CTGAATGAAG 


CCATACCAAA 


8760 


CGACGAGCGT 


GACACCACGA 


TGCCTGTAGC 


AATGCCAACA 


ACGTTGCGCA 


AACTATTAAC 


8820 


TGGCGAACTA 


CTTACTCTAG 


CTTCCCGGCA 


ACAATTAATA 


GACTGGATGG 


AGGCGGATAA


8880 


AGTTGCAGGA 


CCACTTCTGC 


GCTCGGCCCT 


TCCGGCTGGC 


TGGTTTATTG 


CTGATAAATC 


8940 


TGGAGCCGGT 


GAGCGTGGGT 


CTCGCGGTAT 


CATTGCAGCA 


CTGGGGCCAG 


ATGGTAAGCC 


9000 


CTCCCGTATC 


GTAGTTATCT 


ACACGACGGG 


GAGTCAGGCA 


ACTATGGATG 


AACGAAATAG 


9060 


ACAGATCGCT 


GAGATAGGTG 


CCTCACTGAT 


TAAGCATTGG 


TAACTGTCAG 


ACCAAGTTTA 


9120 


CTCATATATA 


CTTTAGATTG 


ATTTAAAACT 


TCATTTTTAA 


TTTAAAAGGA 


TCTAGGTGAA 


9180 


GATCCTTTTT 


GATAATCTCA 


TGACCAAAAT 


CCCTTAACGT 


GAGTTTTCGT 


TCCACTGAGC 


9240 


GTCAGACCCC 


GTAGAAAAGA 


TCAAAGGATC 


TTCTTGAGAT 


CCTTTTTTTC 


TGCGCGTAAT 


9300 


CTGCTGCTTG 


CAAACAAAAA 


AACCACCGCT 


ACCAGCGGTG 


GTTTGTTTGC 


CGGATCAAGA 


9360 


GCTACCAACT 


CTTTTTCCGA 


AGGTAACTGG 


CTTCAGCAGA 


GCGCAGATAC 


CAAATACTGT 


9420 


CCTTCTAGTG 


TAGCCGTAGT 


TAGGCCACCA


CTTCAAGAAC 


TCTGTAGCAC 


CGCCTACATA 


9480 


CCTCGCTCTG 


CTAATCCTGT 


TACCAGTGGC 


TGCTGCCAGT 


GGCGATAAGT 


CGTGTCTTAC 


9540 


CGGGTTGGAC 


TCAAGACGAT 


AGTTACCGGA 


TAAGGCGCAG 


GGGTCGGGCT 


GAACGGGGGG 


9600 


TTCGTGCAGA 






CACCTACACC 


GAACTGAGAT 


ACCTACAGCG 


9660 


TGAGCTATGA 


GAAAGCGCCA 


CGCTTCCCGA 


AGGGAGAAAG 


GCGGACAGGT 


ATCCGGTAAG 


9720 
CGGCAGGGTC


GGAACAGGAG 


AGCGCACGAG 


GGAGCTTCCA


GGGGGAAACG 


CCTGGTATCT 


9780 


TTATAGTCCT 


GTCGGGTTTC 


GCCACCTCTG 


ACTTGAGCGT 


CGATTTTTGT 


GATGCTCGTC 


9840 


AGGGGGGCGG 


AGCCTATCGA 


AAAACGCCAG 


CAACGCGGCC 


TTTTTACGGT 


TCCTGGCCTT 


9900 


TTGCTGGCCT 


TTTGCTCACA 


TGTTCTTTCC 


TGCGTTATCC 


CCTGATTCTG 


TGGATAACCG 


9960 


TATTACCGCC 


TTTGAGTGAG 


CTGATACCGC 


TCGCCGCAGC 


CGAACGACCG 


AGCGCAGCGA


10020 


GTCAGTGAGC 


GAGGAAGCGG 


AAGAGCGCCC 


AATACGCAAA


CCGCCTCTCC 


CCGCGCGTTG 


10080 


GCCGATTCAT 


TAATGCAGCT 


GGCACGACAG 


GTTTCCCGAC 


TGGAAAGCGG 


GCAGTGAGCG 


10140 


CAACGCAATT 


AATGTGAGTT 


AGCTCACTCA 


TTAGGCACCC 


CAGGCTTTAC 


ACTTTATGCT 


10200 


TCCGGCTCGT 


ATGTTGTGTG 


GAATTGTGAG 


CGGATAACAA 


TTTCACACAG 


GAAACAGCTA 


10260 


TGACCATGAT 


TACGCCAAGC 


TATTTAGGTG 


ACACTATAGA 


ATACTC 




10306 
(2) INFORMATION FOR SEQ ID NO: 5:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 10970 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(ix) FEATURE:

(A) NAME/KEY: misc_feature

(B) LOCATION: 1..10970

(D) OTHER INFORMATION: /standard_name= "p537 retroviral vector"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:



AAGCTTCGGC 


CAAGTGCGGC 


CCTTCCGTTT 


CTTTGCTTTT 


GAAAGACCCC 


ACCCGTAGGT 


60 


GGCAAGCTAG 


CTTAAGTAAC 


GCCACTTTGC 


AAGGCATGGA 


AAAATACATA 


ACTGAGAATA 


120 


GGAAAGTTCA 


GATCAAGGTC 


AGGAACAAAG 


AAACAGCTGA 


ATACCAAACA 


GGATATCTGT 


180 


GGTAAGCGGT 


TCCTGCCCCC 


GGCTCAGGGC 


CAAGAACAGA 


TGAGACAGCT 


GAGTGATGGG 


240 


CCAAACAGGA 


TATCTGTGGT 


AAGCAGTTCC 


TGCCCCGGCT 


CGGGGCCAAG 


AACAGATGGT 


300 


CCCCAGATGC 


GGTCCAGCCC 


TCAGCAGTTT 


CTAGTGAATC 


ATCAGATGTT 


TCCAGGGTGC 


360 


CCCAAGGACC 


TGAAAATGAC 


CCTGTACCTT 


ATTTGAACTA 


ACCAATCAGT 


TCGCTTCTCG 


420 


CTTCTGTTCG 


CGCGCTTCCG 


CTCTCCGAGC 


TCAATAAAAG 


AGCCCACAAC 


CCCTCACTCG 


480 


GCGCGCCAGT 


CTTCCGATAG 


ACTGCGTCGC 


CCGGGTACCC 


GTATTCCCAA 


TAAAGCCTCT 


540 


TGCTGTTTGC 


ATCCGAATCG 


TGGTCTCGCT 


GTTCCTTGGG 


AGGGTCTCCT 


CTGAGTGATT 


600 


GACTACCCAC 


GACGGGGGTC 


TTTCATTTGG 


GGGCTCGTCC 


GGGATTTGGA 


GACCCCTGCC 


660 


CAGGGACCAC 


CGACCCACCA 


CCGGGAGGTA 


AGCTGGCCAG 


CAACCTATCT 


GTGTCTGTCC 


720 


GATTGTCTAG 


TGTCTATGTT 


TGATGTTATG 


CGCCTGCGTC 


TGTACTAGTT 


AGCTAACTAG 


780 


CTCTGTATCT 


GGCGGACCCG 


TGGTGGAACT 


GACGAGTTCT 


GAACACCCGG 


CCGCAACCCA 


840 


GGGAGACGTC 


CCAGGGACTT 


TGGGGGCCGT 


TTTTGTGGCC 


CGACCTGAGG 


AAGGGAGTCG 


900 


ATGTGGAATC 


CGACCCCGTC 


AGGATATGTG 


GTTCTGGTAG 


GAGACGAGAA 


CCTAAAACAG 


960 


TTCCCGCCTC 


CGTCTGAATT 


TTTGCTTTCG 


GTTTGGAACC 


GAAGCCGCGC 


GTCTTGTCTG 


1020 


CTGCAGCATC 


GTTCTGTGTT 


GTCTCTGTCT 


GACTGTGTTT 


CTGTATTTGT 


CTGAAAATTA 


1080 


GGGCCAGACT 


GTTACCACTC 


CCTTAAGTTT 


GACCTTAGGT 


CACTGGAAAG 


ATGTCGAGCG 


1140 


GATCGCTCAC 


AACCAGTCGG 


TAGATGTCAA 


GAAGAGACGT 


TGGGTTACCT 


TCTGCTCTGC 


1200 


AGAATGGCCA 


ACCTTTACGT 


CGGATGGCCG 


CGAGACGGCA 


CCTTTAACCG 


AGACCTCATC 


1260 


ACCCAGGTTA 


AGATCAAGGT 


CTTTTCACCT 


GGCCCGCATG 


GACACCCAGA 


CCAGGTCCCC 


1320 


TACATCGTGA 


CCTGGGAAGC 


CTTGGCTTTT 


GACCCCCCTC 


CCTGGGTCAA 


GCCCTTTGTA 


1380 


CACCCTAAGC 


CTCCGCCTCC 


TCTTCCTCCA 


TCCGCCCCGT 


CTCTCCCCCT 


TGAACCTCCT 


1440 


CGTTCGACCC 


CGCCTCGATC 


CTCCCTTTAT 


CCAGCCCTCA 


CTCCTTCTCT 


AGGCGGGAAT 


1500 
TCGTTAACTC


GACCCGCGGG 


TCGACTCGCG 


AAGATCTTTC 


CGCAGCAGCC 


GCCACCATGG 


1560 


TTACGGATTC 


GGATCCCGTC 


GTTTTACAAC 


GTCGTGACTG 


GGAAAACCCT 


GGCGTTACCC 


1620 


AACTTAATCG 


CCTTGCAGCA 


CATCCCCCTT 


TCGCCAGCTG 


GCGTAATAGC 


GAAGAGGCCC 


1680 


GCACCGATCG 


CCCTTCCCAA 


CAGTTGCGCA 


GCCTGAATGG 


CGAATGGCGC 


TTTGCCTGGT 


1740 


TTCCGGCACC 


AGAAGCGGTG 


CCGGAAAGCT 


GGCTGGAGTG 


CGATCTTCCT 


GAGGCCGATA 


1800 


CTGTCGTCGT 


CCCCTCAAAC 


TGGCAGATGC 


ACGGTTACGA 


TGCGCCCATC 


TACACCAACG 


1860 


TAACCTATCC 


CATTACGGTC 


AATCCGCCGT 


TTGTTCCCAC 


GGAGAATCCG 


ACGGGTTGTT 


1920 


ACTCGCTCAC 


ATTTAATGTT 


GATGAAAGCT 


GGCTACAGGA 


AGGCCAGACG 


CGAATTATTT 


1980 


TTGATGGCGT 


TAACTCGGCG 


TTTCATCTGT 


GGTGCAACGG 


GCGCTGGGTC 


GGTTACGGCC 


2040 


AGGACAGTCG 


TTTGCCGTCT 


GAATTTGACC 


TGAGCGCATT 


TTTACGCGCC 


GGAGAAAACC 


2100 


GCCTCGCGGT 


GATGGTGCTG 


CGTTGGAGTG 


ACGGCAGTTA 


TCTGGAAGAT 


CAGGATATGT 


2160 


GGCGGATGAG 


CGGCATTTTC 


CGTGACGTCT 


CGTTGCTGCA 


TAAACCGACT 


ACACAAATCA 


2220 


GCGATTTCCA 


TGTTGCCACT 


CGCTTTAATG 


ATGATTTCAG 


CCGCGCTGTA 


CTGGAGGCTG 


2280 


AAGTTCAGAT 


GTGCGGCGAG 


TTGCGTGACT 


ACCTACGGGT 


AACAGTTTCT 


TTATGGCAGG 


2340 


GTGAAACGCA 


GGTCGCCAGC 


GGCACCGCGC 


CTTTCGGCGG 


TGAAATTATC 


GATGAGCGTG 


2400 


GTGGTTATGC 


CGATCGCGTC 


ACACTACGTC 


TGAACGTCGA 


AAACCCGAAA 


CTGTGGAGCG 


2460 


CCGAAATCCC 


GAATCTCTAT 


CGTGCGGTGG 


TTGAACTGCA 


CACCGCCGAC 


GGCACGCTGA 


2520 


TTGAAGCAGA 


AGCCTGCGAT 


GTCGGTTTCC 


GCGAGGTGCG 


GATTGAAAAT 


GGTCTGCTGC 


2580 


TGCTGAACGG 


CAAGCCGTTG 


CTGATTCGAG 


GCGTTAACCG 


TCACGAGCAT 


CATCCTCTGC 


2640 


ATGGTCAGGT 


CATGGATGAG 


CAGACGATGG 


TGCAGGATAT 


CCTGCTGATG 


AAGCAGAACA 


2700 


ACTTTAACGC 


CGTGCGCTGT 


TCGCATTATC 


CGAACCATCC 


GCTGTGGTAC


ACGCTGTGCG 


2760 


ACCGCTACGG 


CCTGTATGTG 


GTGGATGAAG 


CCAATATTGA 


AACCCACGGC 


ATGGTGCCAA 


2820 


TGAATCGTCT 


GACCGATGAT 


CCGCGCTGGC 


TACCGGCGAT 


GAGCGAACGC 


GTAACGCGAA 


2880 


TGGTGCAGCG 


CGATCGTAAT 


CACCCGAGTG 


TGATCATCTG 


GTCGCTGGGG 


AATGAATCAG 


2940 


GCCACGGCGC 


TAATCACGAC 


GCGCTGTATC 


GCTGGATCAA 


ATCTGTCGAT 


CCTTCCCGCC 


3000 


CGGTGCAGTA 


TGAAGGCGGC 


GGAGCCGACA 


CCACGGCCAC 


CGATATTATT 


TGCCCGATGT 


3060 


ACGCGCGCGT 


GGATGAAGAC 


CAGCCCTTCC 


CGGCTGTGCC 


GAAATGGTCC 


ATCAAAAAAT 


3120 


GGCTTTCGCT 


ACCTGGAGAG 


ACGCGCCCGC 


TGATCCTTTG 


CGAATACGCC 


CACGCGATGG 


3180 


GTAACAGTCT 


TGGCGGTTTC 


GCTAAATACT 


GGCAGGCGTT 


TCGTCAGTAT 


CCCCGTTTAC 


3240 


AGGGCGGCTT 


CGTCTGGGAC 


TGGGTGGATC 


AGTCGCTGAT 


TAAATATGAT 


GAAAACGGCA 


3300 


ACCCGTGGTC 


GGCTTACGGC 


GGTGATTTTG 


GCGATACGCC 


GAACGATCGC 


CAGTTCTGTA 


3360 


TGAACGGTCT 


GGTCTTTGCC 


GACCGCACGC 


CGCATCCAGC 


GCTGACGGAA 


GCAAAACACC 


3420 


AGCAGCAGTT 


TTTCCAGTTC 


CGTTTATCCG 


GGCAAACCAT 


CGAAGTGACC 


AGCGAATACC 


3480 






GAGCTCCTGC 


ACTGGATGGT 


GGCGCTGGAT 


GGTAAGCCGC 


3540 


TGGCAAGCGG 


TGAAGTGCCT 


CTGGATGTCG 


CTCCACAAGG 


TAAACAGTTG 


ATTGAACTGC 


3600 
CTGAACTACC GCAGCCGGAG AGCGCCGGGC
CGAACGCGAC CGCATGGTCA GAAGCCGGGC 
CGGAAAACCT CAGTGTGACG CTCCCCGCCG 
GCGAAATGGA TTTTTGCATC GAGCTGGGTA
GCTTTCTTTC ACAGATGTGG ATTGGCGATA 
AGTTCACCCG TGCACCGCTG GATAACGACA 
CTAACGCCTG GGTCGAACGC TGGAAGGCGG 
TGCAGTGCAC GGCAGATACA CTTGCTGATG 
AGCATCAGGG GAAAACCTTA TTTATCAGCC 
AAATGGCGAT TACCGTTGAT GTTGAAGTGG 
GCCTGAACTG CCAGCTGGCG CAGGTAGCAG 
AAGAAAACTA TCCCGACCGC CTTACTGCCG 
CAGACATGTA TACCCCGTAC GTCTTCCCGA 
AATTGAATTA TGGCCCACAC CAGTGGCGCG 
GTCAACAGCA ACTGATGGAA ACCAGCCATC 
GGCTGAATAT CGACGGTTTC CATATGGGGA 
TATCGGCGGA ATTGCAGCTG AGCGCCGGTC 
AATAATAATA ACCGGGCAGG CCATGTCTGC 
ACTATTTCTA GAGAATTCCC CCCTCTCCCT 
GCTTGGAATA AGGCCGGTGT GCGTTTGTCT 
TTGGCAATGT GAGGGCCCGG AAACCTGGCC 
TTTCCCCTCT GCGCAAAGGA ATGCAAGGTC 
TGGAAGCTTC TTGAAGACAA ACAACGTCTG 
CACCTGGCGA CAGGTGCCTC TGCGGCCAAA 
CGGCACAACC CCAGTGCCAC GTTGTGAGTT 
CCTCAAGCGT ATTCAACAAG GGGCTGAAGG 
CTGATCTGGG GCCTCGGTGC ACATGCTTTA 
AGGCCCCCCG AACCACGGGG ACGTGGTTTT 
AGCTCCTAGG CTTTTGCAAA AAGCTCCCGG
AAGAGACAGG ATGAGGATCG TTTCGCATGA 
CGGCCGCTTG GGTGGAGAGG CTATTCGGCT 
CTGATGCCGC CGTGTTCCGG CTGTCAGCGC 
ACCTGTCCGG TGCCCTGAAT GAACTGCAGG 
CGACGGGCGT TCCTTGCGCA GCTGTGCTCG 
TGCTATTGGG CGAAGTGCCG GGGCAGGATC
AACTCTGGCT


CACAGTACGC 


GTAGTGCAAC 


3660 


ACATCAGCGC 


CTGGCAGCAG 


TGGCGTCTGG 


3720 


CGTCCCACGC 


CATCCCGCAT 


CTGACCACCA 


3780 


ATAAGCGTTG 


GCAATTTAAC 


CGCCAGTCAG 


3840 


AAAAACAACT 


GCTGACGCCG 


CTGCGCGATC


3900 


TTGGCGTAAG 


TGAAGCGACC 


CGCATTGACC 


3960 


CGGGCCATTA 


CCAGGCCGAA 


GCAGCGTTGT 


4020 


CGGTGCTGAT 


TACGACCGCT 


CACGCGTGGC 


4080 


GGAAAACCTA 


CCGGATTGAT 


GGTAGTGGTC 


4140 


CGAGCGATAC 


ACCGCATCCG 


GCGCGGATTG 


4200 


AGCGGGTAAA 


CTGGCTCGGA 


TTAGGGCCGC 


4260 


CCTGTTTTGA 


CCGCTGGGAT 


CTGCCATTGT 


4320 


GCGAAAACGG 


TCTGCGCTGC 


GGGACGCGCG 


4380 


GCGACTTCCA 


GTTCAACATC 


AGCCGCTACA 


4440 


GCCATCTGCT 


GCACGCGGAA 


GAAGGCACAT 


4500 


TTGGTGGCGA 


CGACTCCTGG 


AGCCCGTCAG 


4560 


GCTACCATTA 


CCAGTTGGTC 


TGGTGTCAAA 


4620 


CCGTATTTCG 


CGTAAGGAAA 


TCCATTATGT 


4680 


CCCCCCCCCC


TAACGTTACT 


GGCCGAAGCC 


4740 


ATATGTTATT 


TTCCACCATA


TTGCCGTCTT 


4800 


CTGTCTTCTT 


GACGAGCATT 


CCTAGGGGTC 


4860 


TGTTGAATGT 


CGTGAAGGAA 


GCAGTTCCTC 


4920 


TAGCGACCCT 


TTGCAGGCAG 


CGGAACCCCC 


4980 


AGCCACGTGT 


ATAAGATACA 


CCTGCAAAGG 


5040 


GGATAGTTGT 


GGAAAGAGTC 


AAATGGCTCT 


5100 


ATGCCCAGAA 


GGTACCCCAT 


TGTATGGGAT 


5160 


CATGTGTTTA 


GTCGAGGTTA 


AAAAACGTCT 


5220 


CCTTTGAAAA 


ACACGATGAT 


AATATGGCCA 


5280 


GAGCTTGGAT 


ATCCATTTTC 


GGATCTGATC 


5340 


TTGAACAAGA 


TGGATTGCAC 


GCAGGTTCTC 


5400 


ATGACTGGGC 


ACAACAGACA 


ATCGGCTGCT 


5460 


AGGGGCGCCC 


GGTTCTTTTT 


GTCAAGACCG 


5520 


ACGAGGCAGC 


GCGGCTATCG 


TGGCTGGCCA 


5580 


ACGTTGTCAC




AGGGACTGGC 


5640 


TCCTGTCATC 


TCACCTTGCT 


CCTGCCGAGA 


5700 



AAGTATCCAT


CATGGCTGAT 


GCAATGCGGC 


GGCTGCATAC 


GCTTGATCCG 


GCTACCTGCC 


5760 


CATTCGACCA 


CCAAGCGAAA 


CATCGCATCG 


AGCGAGCACG 


TACTCGGATG 


GAAGCCGGTC 


5820 


TTGTCGATCA 


GGATGATCTG 


GACGAAGAGC 


ATCAGGGGCT 


CGCGCCAGCC 


GAACTGTTCG


5880 


CCAGGCTCAA 


GGCGCGCATG 


CCCGACGGCG 


AGGATCTCGT 


CGTGACCCAT 


GGCGATGCCT 


5940 


GCTTGCCGAA 


TATCATGGTG 


GAAAATGGCC 


GCTTTTCTGG 


ATTCATCGAC 


TCTGGCCGGC 


6000 


TGGGTGTGGC 


GGACCGCTAT 


CAGGACATAG 


CGTTGGCTAC 


CCGTGATATT 


GCTGAAGAGC 


6060 


TTGGCGGCGA 


ATGGGCTGAC 


CGCTTCCTCG 


TGCTTTACGG 


TATCGCCGCT 


CCCGATTCGC 


6120 


AGCGCATCGC 


CTTCTATCGC 


CTTCTTGACG 


AGTTCTTCTG 


AGCGGGACTC 


TGGGGTTCGC 


6180 


CTTGACTTGC 


TGTTTCTAAA 


AGAAGGTGGC 


CTCTGTGCGG 


CCCTAAAGGA 


AGAGTGCTGT 


6240 


TTTTACATAG 


ACCACTCAGG 


TGCAGTACGG 


GACTCCATGA 


AAAAACTCAA 


AGAAAAACTG 


6300 


GATAAAAGAC 


AGTTAGAGCG 


CCAGAAAAGC 


CAAAACTGGT 


ATGAAGGATG 


GTTCAATAAC 


6360 


TCCCCTTGGT 


TCACTACCCT 


GCTATCAACC 


ATCGCTGGGC 


CCCTATTACT 


CCTCCTTCTG 


6420 


TTGCTCATCC 


TCGGGCCATG 


CATAGGGAAG 


GTGCCTCTTA


CCCATCAACA 


TCTTTGCAAC 


6480 


CAGACCTTAC 


CCATCAATTC 


CTCTAAAAAC 


CATCAGTATC 


TGCTCCCCTC 


AAACCATAGC 


6540 


TGGTGGGCCT 


GCAGCACTGG 


CCTCACCCCC 


TGCCTCTCCA 


CCTCAGTTTT 


TAATCAGTCT 


6600 


AAAGACTTCT 


GTGTCCAGGT 


CCAGCTGATC 


CCCCGCATCT 


ATTACCATTC 


TGAAGAAACC 


6660 


TTGTTACAAG 


CCTATGACAA 


ATCACCCCCC 


AGGTTTAAAA 


GAGAGCCTGC 


CTCACTTACC 


6720 


CTAGCTGTCT 


TCCTGGGGTT 


AGGGATTGCG 


GCAGGTATAG 


GTACTGGCTC 


AACCGCCCTA 


6780 


ATTAAAGGGC 


CCATAGACCT 


TCAGCAAGGC 


CTAACCAGCC 


TCCAAATCGC 


CATTGACGCT 


6840 


GACCTCCGGG


CCCTTCAGGA 


CTCAATCAGC 


AAGCTAGAGG 


ACTCACTGAC 


TTCCCTATCT


6900 


GAGGTAGTAC 


TCCAAAATAG 


GAGAGGCCTT 


GACTTACTAT 


TCCTTAAAGA 


AGGAGGCCTC 


6960 


TGCGCGGCCC 


TAAAAGAAGA 


GTGCTGTTTT 


TATGTAGACC 


ACTCAGGTGC 


AGTACGAGAC 


7020 


TCCATGAAAA 


AACTTAAAGA 


AAGACTAGAT 


AAAAGACAGT 


TAGAGCGCCA 


GAAAAACCAA 


7080 


AACTGGTATG 


AAGGGTGGTT 


CAATAACTCC 


CCTTGGTTTA 


CTACCCTACT 


ATCAACCATC 


7140 


GCTGGGCCCC 


TATTGCTCCT 


CCTTTTGTTA 


CTCACTCTTG 


GGCCCTGCAT 


CATCAATAAA 


7200 


TTAATCCAAT 


TCATCAATGA 


TAGGATAAGT 


GCAGTCAAAA 


TTTTAGTCCT 


TAGACAGAAA 


7260 


TATCAGACCC 


TAGATAACGA 


GGAAAACCTT 


TAATTTCGCT 


CTAAGATTAG 


AGCTATCCAC 


7320 


AAGAGAAATG 


GGGGAATGAA 


AGAAGTGTTT 


TTCAAGTTAG 


CTGCAGTAAC 


GCCATTCATA 


7380 


AGGCACGCCC 


AAAGCATAAA 


GGTTAAAGAA 


GAAAAAAACC 


GGGCCAAACA 


GGATATCTGT 


7440 


GGTCATACAC 


CTGGAACCCG 


GCCCAGGGCC 


AAACACAGAT 


GGTTCCCAGA 


AATAAAATGA 


7500 


GTCAACAGCA 


GTTTCCAGGG 


TGCCCCTCAA 


CTGTTTCAAG 


AAACTCCCAT 


GACCGGAGCT 


7560 


CACCCCTGAC 


TTATTTGAAC 


TAACCAATCA 


CCTTGCTTCT 


CGCTTCTGTA 


CCCGCGCTTT 


7620 


TTGCTATAAA 


AGGAGCTCAG 


AAATTCCACT 


CGGCGCGCCA 


GTCTTCCAAG 


AGACTGAGTC 


7680 






AATAAAACCT 


CTTGCTACTT 


GCATCCGAAG 


TCGTGGTCTC 


7740 


GCTGTTCCTT 


GGGAAGGTCT 


CCCCTAATTG 


ATTGACCGCC 


CGGACTGGGG 


GTCTCTCATT 


7800 
GGAATTCATC GATGATATCA GCCAATTCGC
CCGTCGTTTT ACAACGTCGT GACTGGGAAA 
CAGCACATCC CCCTTTCGCC AGCTGGCGTA 
CCCAACAGTT GCGCAGCCTG AATGGCGAAT 
ATTCGCGTTA AATATTTGTT AAATCAGCTC 
AATCCCTTAT AAATCAAAAG AATAGACCGA 
CAAGAGTCCA CTATTAAAGA ACGTGGACTC 
GGGCGATGGC CCACTACGTG AACCATCACC
TAAAGCTCTA AATCGGAACC CTAAAGGGAG 
GGCGAACGTG GCGAGAAAGG AAGGGAAGAA 
AAGTGTAGCG GTCACGCTGC GCGTAACCAC 
GGGCGCGTCG CCTGATGCGG TATTTTCTCC 
TATGGTGCAC TCTCAGTACA ATCTGCTCTG 
CGCCAACACC CGCTGACGCG CCCTGACGGG 
AAGCTGTGAC CGTCTCCGGG AGCTGCATGT 
GCGCGAGACG AAAGGGCCTC GTGATACGCC 
TGGTTTCTTA GACGTCAGGT GGCACTTTTC 
TATTTTTCTA AATACATTCA AATATGTATC 
TTCAATAATA TTGAAAAAGG AAGAGTATGA 
CCTTTTTTGC GGCATTTTGC CTTCCTGTTT 
AAGATGCTGA AGATCAGTTG GGTGCACGAG 
GTAAGATCCT TGAGAGTTTT CGCCCCGAAG 
TTCTGCTATG TCATACACTA TTATCCCGTA 
GGGCGCGGTA TTCTCAGAAT GACTTGGTTG 
CGGATGGCAT GACAGTAAGA GAATTATGCA 
CGGCCAACTT ACTTCTGACA ACGATCGGAG 
ACATGGGGGA TCATGTAACT CGCCTTGATC 
CAAACGACGA GCGTGACACC ACGATGCCTG 
TAACTGGCGA ACTACTTACT CTAGCTTCCC 
ATAAAGTTGC AGGACCACTT CTGCGCTCGG 
AATCTGGAGC CGGTGAGCGT GGGTCTCGCG 
AGCCCTCCCG TATCGTAGTT ATCTACACGA 
ATAGACAGAT CGCTGAGATA GGTGCCTCAC 
TTTACTCATA TATACTTTAG ATTGATTTAA 
TGAAGATCCT TTTTGATAAT CTCATGACCA 







CCTATAGTGA 


GTCGTATTAC 


AATTCACTGG 


7860 


ACCCTGGCGT 


TACCCAACTT 


AATCGCCTTG 


7920 


ATAGCGAAGA 


GGCCCGCACC 


GATCGCCCTT 


7980 


GGAAATTGTA 


AACGTTAATA 


TTTTGTTAAA 


8040 


ATTTTTTAAC 


CAATAGGCCG 


AAATCGGCAA


8100 


GATAGGGTTG 


AGTGTTGTTC 


CAGTTTGGAA 


8160 


CAACGTCAAA 


GGGCGAAAAA 


CCGTCTATCA 


8220 


CAAATCAAGT 


TTTTTGCGGT 


CGAGGTGCCG 


8280 


CCCCCGATTT 


AGAGCTTGAC 


GGGGAAAGCC 


8340 


AGCGAAAGGA 


GCGGGCGCTA 


GGGCGCTGGC 


8400 


CACACCCGCC 


GCGCTTAATG 


CGCCGCTACA 


8460 


TTACGCATCT 


GTGCGGTATT 


TCACACCGCA 


8520 


ATGCCGCATA 


GTTAAGCCAG 


CCCCGACACC 


8580 


CTTGTCTGCT 


CCCGGCATCC 


GCTTACAGAC 


8640 


GTCAGAGGTT 


TTCACCGTCA 


TCACCGAAAC 


8700 


TATTTTTATA 


GGTTAATGTC 


ATGATAATAA 


8760 


GGGGAAATGT 


GCGCGGAACC


CCTATTTGTT 


8820 


CGCTCATGAG 


ACAATAACCC 


TGATAAATGC 


8880 


GTATTCAACA 


TTTCCGTGTC 


GCCCTTATTC 


8940 


TTGCTCACCC 


AGAAACGCTG 


GTGAAAGTAA 


9000 


TGGGTTACAT 


CGAACTGGAT 


CTCAACAGCG


9060 


AACGTTTTCC 


AATGATGAGC 


ACTTTTAAAG 


9120 


TTGACGCCGG 


GCAAGAGCAA 


CTCGGTCGCC 


9180 


AGTACTCACC 


AGTCACAGAA 


AAGCATCTTA 


9240 


GTGCTGCCAT 


AACCATGAGT 


GATAACACTG 


9300 


GACCGAAGGA 


GCTAACCGCT 


TTTTTGCACA 


9360 


GTTGGGAACC 


GGAGCTGAAT 


GAAGCCATAC 


9420 


TAGCAATGCC 


AACAACGTTG 


CGCAAACTAT 


9480 


GGCAACAATT 


AATAGACTGG 


ATGGAGGCGG 


9540 


CCCTTCCGGC 


TGGCTGGTTT 


ATTGCTGATA 


9600 


GTATCATTGC 


AGCACTGGGG 


CCAGATGGTA 


9660 


CGGGGAGTCA 


GGCAACTATG 


GATGAACGAA 


9720 


TGATTAAGCA 


TTGGTAACTG 


TCAGACCAAG 


9780 


AACTTCATTT


TTAATTTAAA 


AGGATCTAGG 


9840 


AAATCCCTTA 


ACGTGAGTTT 


TCGTTCCACT 


9900 
GAGCGTCAGA


CCCCGTAGAA 


AAGATCAAAG 


TAATCTGCTG 


CTTGCAAACA 


AAAAAACCAC 


AAGAGCTACC 


AACTCTTTTT 


CCGAAGGTAA 


CTGTCCTTCT 


AGTGTAGCCG 


TAGTTAGGCC 


CATACCTCGC 


TCTGCTAATC 


CTGTTACCAG 


TTACCGGGTT 


GGACTCAAGA 


CGATAGTTAC 


GGGGTTCGTG 


CACACAGCCC 


AGCTTGGAGC 


AGCGTGAGCT 


ATGAGAAAGC 


GCCACGCTTC 


TAAGCGGCAG 


GGTCGGAACA 


GGAGAGCGCA 


ATCTTTATAG 


TCCTGTCGGG 


TTTCGCCACC 


CGTCAGGGGG 


GCGGAGCCTA 


TCGAAAAACG 


CCTTTTGCTG


GCCTTTTGCT 


CACATGTTCT 


ACCGTATTAC 


CGCCTTTGAG 


TGAGCTGATA


GCGAGTCAGT 


GAGCGAGGAA 


GCGGAAGAGC 


GTTGGCCGAT 


TCATTAATGC 


AGCTGGCACG 


AGCGCAACGC 


AATTAATGTG 


AGTTAGCTCA 


TGCTTCCGGC 


TCGTATGTTG 


TGTGGAATTG 


GCTATGACCA 


TGATTACGCC 


AAGCTATTTA 
GATCTTCTTG


AGATCCTTTT 


TTTCTGCGCG 


9960 


CGCTACCAGC 


GGTGGTTTGT 


TTGCCGGATC 


10020 


CTGGCTTCAG 


CAGAGCGCAG


ATACCAAATA 


10080 


ACCACTTCAA 


GAACTCTGTA 


GCACCGCCTA 


10140 


TGGCTGCTGC 


CAGTGGCGAT 


AAGTCGTGTC 


10200 


CGGATAAGGC 


GCAGCGGTCG 


GGCTGAACGG 


10260 


GAACGACCTA 


CACCGAACTG 


AGATACCTAC


10320 


CCGAAGGGAG 


AAAGGCGGAC


AGGTATCCGG 


10380 


CGAGGGAGCT 


TCCAGGGGGA 


AACGCCTGGT 


10440 


TCTGACTTGA 


GCGTCGATTT 


TTGTGATGCT 


10500 


CCAGCAACGC 


GGCCTTTTTA 


CGGTTCCTGG 


10560 


TTCCTGCGTT 


ATCCCCTGAT 


TCTGTGGATA 


10620 


CCGCTCGCCG 


CAGCCGAACG 


ACCGAGCGCA 


10680 


GCCCAATACG 


CAAACCGCCT 


CTCCCCGCGC 


10740 


ACAGGTTTCC


CGACTGGAAA 


GCGGGCAGTG 


10800 


CTCATTAGGC 


ACCCCAGGCT 


TTACACTTTA 


10860 


TGAGCGGATA




ACAGGAAACA 


10920 


GGTGACACTA 


TAGAATACTC 




10970 
WHAT IS CLAIMED IS:

1. A recombinant DNA construct comprising a
defective viral genome comprising a polynucleotide sequence of
interest and a gibbon ape leukemia virus (GaLV) component.

2. The construct of claim 1, wherein the GaLV 
component includes a GaLV packaging site. 

3. The construct of claim 2, wherein the packaging
site consists of between about 150 base pairs and about 1500
base pairs.

4. The construct of claim 2, wherein the packaging
site consists essentially of a sequence extending from about
position 200 to about position 910 of the sequence shown in
Figure 1.

5. The construct of claim 1, wherein the GaLV
component includes regulatory sequences which direct
expression of the polynucleotide of interest.

6. The construct of claim 5, wherein the
regulatory sequences are from a GaLV 3' LTR.

7. The construct of claim 6, wherein the promoter
is from GaLV SF.

8. A mammalian cell comprising the defective viral
genome of claim 1.

9. The cell of claim 8, further comprising
retroviral gag and pol genes.






10. The cell of claim 9, wherein the gag and pol 
genes are from GaLV SF or GaLV SEATO. 
11. The cell of claim 9, wherein the gag and pol
genes are from MoMLV. 

12. The cell of claim 8, further comprising a
retroviral env gene.

13. The cell of claim 12, wherein the env gene is 
from GaLV SF or GaLV SEATO. 






14. The cell of claim 8, which is PG13 or PA317. 

15. An isolated hybrid virion comprising GaLV 
envelope proteins and an RNA genome comprising a 
polynucleotide sequence of interest and a GaLV component. 

16. The virion of claim 15, further comprising GaLV 
core proteins. 

17. The virion of claim 15, further comprising
MoMLV core proteins.

18. The virion of claim 15, wherein the envelope 
proteins are GaLV SF proteins. 

19. The virion of claim 15, wherein the GaLV
sequence includes a packaging site.

20. The virion of claim 19, wherein the packaging
site is transcribed from a sequence consisting of between
about 150 base pairs and about 1500 base pairs.

21. The virion of claim 19, wherein the packaging
site is transcribed from a polynucleotide sequence extending
from about position 200 to about position 910 of the sequence
shown in Figure 1.



22. An isolated recombinant DNA construct
comprising a polynucleotide sequence which encodes an
infectious GaLV virion capable of infecting a mammalian cell
and producing infectious viral progeny.

23. The construct of claim 22, wherein the
DNA construct comprises GaLV SF sequences and GaLV SEATO
sequences.

24. The construct of claim 23, wherein the DNA
construct comprises 97% GaLV SEATO sequences and 3% GaLV SF
sequences.

25. A method of introducing a polynucleotide of
interest into human cells having a GaLV receptor, the method
comprising:

contacting the cells with hybrid virions comprising
GaLV envelope proteins and an RNA genome comprising the
polynucleotide sequence of interest and a GaLV packaging site;
and

selecting cells having the polynucleotide of
interest.

26. The method of claim 25, further comprising
implanting the cells in a human patient.






27. The method of claim 25, wherein the human cells 
are selected from the group consisting of bone marrow cells 
and tumor infiltrating cells. 
FIGURE 1. GENETIC ORGANIZATION OF GaLV
[Nucleotide sequence with amino acid translation and position markers, continued over two pages; the sequence text is not recoverable from the scan. Running header on the continuation page: DELASSUS, SONIGO, AND WAIN-HOBSON.]
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FIGURE 2A 
REPAIR OF POL GENE OF GALV SEATO

[Diagram residue: stepwise cloning scheme. Recoverable labels: STEP; GaLV-SEATO permuted provirus; GaLV-SF 250 bp pol sequence; LTR's; env; Sal sites.]
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FIGURE 2B CHANGE OF GALV SEATO INSERT ORIENTATION

[Diagram residue. Recoverable labels: STEP; INTERMEDIATE CLONE 120; a second intermediate clone whose number is illegible.]
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FIGURE 2C INTERMEDIATE CLONE 80: UNIDIRECTIONAL DECREASE IN INSERT LENGTH USING EXONUCLEASES III AND VII

[Diagram residue. Recoverable labels: STEP; step 10; INTERMEDIATE CLONE 68; insert of approximately 5.4 kb; Not site.]
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pVZ-1 



[Diagram residue. Recoverable labels: steps 14, 16-17, 18, 19, 20; Sal; incomplete Not I site; severely damaged Not I site; complete Not I site (GCGGCCGC / CGCCGGCG); deletion endpoints within outer LTR, within inner LTR, and within env gene; INTERMEDIATE CLONE 66Exo52.]
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FIGURE 2E INTERMEDIATE CLONE 120: UNIDIRECTIONAL DECREASE IN INSERT LENGTH USING EXONUCLEASES III AND VII

[Diagram residue. Recoverable labels: STEP; steps 21, 22-32; INTERMEDIATE CLONE 120; fragment of approximately 2.8 kb; Not; Sal; complete Not I site; INTERMEDIATE CLONE 120ExoS5.]






FIGURE 2F COUPLING OF CLONE 66EXO52 INSERT AND CLONE 120EXO80 INSERT: SEPARATION OF LTR'S AND GENERATION OF POTENTIAL INFECTIOUS CLONE

[Diagram residue. Recoverable labels: STEP; CLONE II.]
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Figure 3 
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